{"id":2666,"date":"2015-11-07T00:44:00","date_gmt":"2015-11-07T00:44:00","guid":{"rendered":"https:\/\/www.htmlgoodies.com\/uncategorized\/fetch-hyperlinked-files-using-jsoup\/"},"modified":"2015-11-07T00:44:00","modified_gmt":"2015-11-07T00:44:00","slug":"fetch-hyperlinked-files-using-jsoup","status":"publish","type":"post","link":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/","title":{"rendered":"Fetch Hyperlinked Files using Jsoup"},"content":{"rendered":"<p>\n<title>Fetch Hyperlinked Files using Jsoup<\/title>\n<\/p>\n<p>In the <a href=\"http:\/\/www.htmlgoodies.com\/java\/download-linked-resources-using-jsoup\/\" target=\"_blank\" rel=\"noopener\">Download Linked Resources using Jsoup<\/a> tutorial, we learned how to select a specific hyperlink element based on a unique attribute value in order to download a linked MP3. In today&#8217;s conclusion, we&#8217;ll cover how to extract the absolute URL from the first link in the Elements Collection and save the MP3 file on our local device.<\/p>\n<h2>Retrieving the Download Link<\/h2>\n<p>Recall that in the last article we invoked the org.jsoup.nodes.Document object&#8217;s select() method to return a collection of matching Elements. Although we are using an identifier that we believe to be unique, it never hurts to check how many items were returned. If more than one link comes back, we can handle the situation. For the sake of simplicity, I elected to process the first element only.<\/p>\n<p>The handy first() shortcut method gets the element that we&#8217;re after.<\/p>\n<p>We need an absolute URL to work with because we will be making a new request to the server. 
To do that, we need to include the &#8220;abs:&#8221; prefix on the attribute name.<\/p>\n<pre class=\"brush:javascript\">\/\/select links whose href attribute ends with \"t=60\"\n\/\/should only be one but select returns an Elements collection\nElements links = doc.select(\"a[href$=\" + LINK + \"]\"); \/\/LINK is the unique identifier of \"t=60\"\nint linksSize = links.size();\nif (linksSize &gt; 0) {\n  if (linksSize &gt; 1) {\n      System.out.println(\"Warning: more than one link found.  Downloading first match.\");\n  }\n  Element link = links.first();\n  \/\/this returns an absolute URL\n  String  linkUrl = link.attr(\"abs:href\");\n<\/pre>\n<h2>Downloading the Linked File<\/h2>\n<p>This step is significantly more difficult than one might expect. One problem, ingeniously solved by <a href=\"http:\/\/jmchung.github.io\/blog\/2013\/10\/25\/how-to-solve-jsoup-does-not-get-complete-html-document\/\" target=\"_blank\" rel=\"noopener\">Jeremy Chung<\/a>, is that Jsoup caps the size of the response body by default. His solution was to set the maxBodySize property to zero, which removes the limit.<\/p>\n<p>But that&#8217;s not the end of it. I noticed that servers sometimes check things like the referrer and userAgent to prevent third-party linking. Luckily, both of those properties are easy to set. Another important setting is ignoreContentType. It is set to false by default so that an unrecognized content-type will cause an IOException to be thrown. This is to prevent Jsoup from attempting to parse binary content. 
In our case, we are simply using Jsoup to download the file, so we have to tell it to ignore the content type.<\/p>\n<pre class=\"brush:javascript\">\/\/Thanks to Jeremy Chung for maxBodySize solution\n\/\/http:\/\/jmchung.github.io\/blog\/2013\/10\/25\/how-to-solve-jsoup-does-not-get-complete-html-document\/\nbyte[] bytes = Jsoup.connect(linkUrl)\n   .header(\"Accept-Encoding\", \"gzip, deflate\")\n   .userAgent(\"Mozilla\/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko\/20100101 Firefox\/23.0\")\n   .referrer(URL_TO_PARSE)\n   .ignoreContentType(true)\n   .maxBodySize(0)\n   .timeout(600000)\n   .execute()\n   .bodyAsBytes();\n<\/pre>\n<h2>Validating the File Type<\/h2>\n<p>If we really want to be thorough, we can check the file to make sure that it&#8217;s really an MP3. The file extension might be a good clue, but since many dynamic links such as this one don&#8217;t include the file name at all, much less an extension, we have to check the contents for indicators.<\/p>\n<pre class=\"brush:javascript\">try {\n\n    validateMP3File(bytes);\n       \n} catch (IOException e) {\n    System.err.println(\"Could not read the file at '\" + linkUrl + \"'.\");\n}\ncatch (InvalidFileTypeException e) {\n    System.err.println(\"'\" + linkUrl + \"' does not appear to point to an MP3 file.\");\n}\n<\/pre>\n<p>The InvalidFileTypeException is our own class and is defined as a private static nested class:<\/p>\n<pre class=\"brush:javascript\">@SuppressWarnings(\"serial\")\nprivate static class InvalidFileTypeException extends Exception {}\n<\/pre>\n<h2>The validateMP3File() Method<\/h2>\n<p>Many MP3 files begin with bytes reserved for information about the content, called an <a href=\"http:\/\/id3.org\/\" target=\"_blank\" rel=\"noopener\">ID3 tag<\/a>. 
The following code:<\/p>\n<ol>\n<li>connects an InputStream to the byte array<\/li>\n<li>reads the first kilobyte (1024 bytes) of content into a byte array<\/li>\n<li>converts the byte array into a String using the new String(byte[] bytes) constructor<\/li>\n<li>stores the first three characters in a variable<\/li>\n<li>compares it to the &#8220;ID3&#8221; marker<\/li>\n<li>throws a new InvalidFileTypeException if the marker is not present<\/li>\n<\/ol>\n<pre class=\"brush:javascript\">public static void validateMP3File(byte[] song) throws IOException, InvalidFileTypeException {\n    InputStream file = new ByteArrayInputStream(song);\n    byte[] startOfFile = new byte[1024];\n    file.read(startOfFile);\n    String id3 = new String(startOfFile);\n    String tag = id3.substring(0, 3);\n    if  ( ! \"ID3\".equals(tag) ) {\n        throw new InvalidFileTypeException();\n    }\n}\n<\/pre>\n<h2>Saving the File<\/h2>\n<p>All that remains now is to save the file. We can use the link text to name the file, appending the .mp3 extension if it is missing. A FileOutputStream writes the bytes to the new file.<\/p>\n<pre class=\"brush:javascript\">try {\n    validateMP3File(bytes);\n                                       \n    String savedFileName = link.text();\n    \/\/Strings are immutable, so the result of concat() must be assigned back\n    if (!savedFileName.endsWith(\".mp3\")) savedFileName = savedFileName.concat(\".mp3\");\n    FileOutputStream fos = new FileOutputStream(savedFileName);\n    fos.write(bytes);\n    fos.close();\n\n    System.out.println(\"File has been downloaded.\");\n} catch (IOException e) {\n\/\/...    
\n<\/pre>\n<p>Here is the full source for the JsoupDemoTest class:<\/p>\n<pre class=\"brush:javascript\">package com.robgravelle.jsoupdemo;\n\nimport static org.jsoup.Jsoup.parse;\n\nimport java.io.ByteArrayInputStream;\nimport java.io.FileOutputStream;\nimport java.io.IOException;\nimport java.io.InputStream;\nimport java.net.URL;\n\nimport org.jsoup.Jsoup;\nimport org.jsoup.nodes.Document;\nimport org.jsoup.nodes.Element;\nimport org.jsoup.select.Elements;\n\npublic class JsoupDemoTest {\n    private final static String URL_TO_PARSE  = \"http:\/\/robgravelle.com\/albums\/\";\n    private final static String LINK = \"t=60\";\n    @SuppressWarnings(\"serial\")\n    private static class InvalidFileTypeException extends Exception {}\n   \n    public static void main(String[] args) throws IOException {\n        \/\/these two lines are only required if your Internet\n        \/\/connection uses a proxy server\n        \/\/System.setProperty(\"http.proxyHost\", \"my.proxy.server\");\n        \/\/System.setProperty(\"http.proxyPort\", \"8081\");\n        URL url = new URL(URL_TO_PARSE);\n        Document doc = parse(url, 30000);\n       \n        Elements links = doc.select(\"a[href$=\" + LINK + \"]\");\n        int linksSize = links.size();\n        if (linksSize &gt; 0) {\n            if (linksSize &gt; 1) {\n                System.out.println(\"Warning: more than one link found.  
Downloading first match.\");\n            }\n            Element link    = links.first();\n            String  linkUrl = link.attr(\"abs:href\");\n            \/\/Thanks to Jeremy Chung for maxBodySize solution\n            \/\/http:\/\/jmchung.github.io\/blog\/2013\/10\/25\/how-to-solve-jsoup-does-not-get-complete-html-document\/\n            byte[] bytes = Jsoup.connect(linkUrl)\n                .header(\"Accept-Encoding\", \"gzip, deflate\")\n                .userAgent(\"Mozilla\/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko\/20100101 Firefox\/23.0\")\n                .referrer(URL_TO_PARSE)\n                .ignoreContentType(true)\n                .maxBodySize(0)\n                .timeout(600000)\n                .execute()\n                .bodyAsBytes();\n           \n                try {\n                    validateMP3File(bytes);\n                   \n                    String savedFileName = link.text();\n                    \/\/Strings are immutable, so the result of concat() must be assigned back\n                    if (!savedFileName.endsWith(\".mp3\")) savedFileName = savedFileName.concat(\".mp3\");\n                    FileOutputStream fos = new FileOutputStream(savedFileName);\n                    fos.write(bytes);\n                    fos.close();\n                   \n                    System.out.println(\"File has been downloaded.\");\n                } catch (IOException e) {\n                    System.err.println(\"Could not read the file at '\" + linkUrl + \"'.\");\n                }\n                catch (InvalidFileTypeException e) {\n                    System.err.println(\"'\" + linkUrl + \"' does not appear to point to an MP3 file.\");\n                }\n        }\n        else {\n            System.out.println(\"Could not find the link ending with '\" + LINK + \"' in web page.\");\n        }\n    }\n   \n    public static void validateMP3File(byte[] song) throws IOException, InvalidFileTypeException {\n        InputStream file = new ByteArrayInputStream(song);\n        byte[] startOfFile = new byte[6];\n        file.read(startOfFile);\n      
  String id3 = new String(startOfFile);\n        \/\/only the first three characters hold the \"ID3\" marker\n        if  ( ! id3.startsWith(\"ID3\") ) {\n            throw new InvalidFileTypeException();\n        }\n    }\n   \n    \/\/validateMP3File() is based on this method\n    public static void getMP3Metadata(byte[] song) {\n        try {\n            InputStream file = new ByteArrayInputStream(song);\n            int size = (int)song.length;\n            byte[] startOfFile = new byte[1024];\n            file.read(startOfFile);\n            String id3 = new String(startOfFile);\n            String tag = id3.substring(0, 3);\n            if  (\"ID3\".equals(tag)) {\n                System.out.println(\"Title: \" + id3.substring(3, 32));\n                System.out.println(\"Artist: \" + id3.substring(33, 62));\n                System.out.println(\"Album: \" + id3.substring(63, 91));\n                System.out.println(\"Year: \" + id3.substring(93, 97));\n            } else\n                System.out.println(\"does not contain\" + \" ID3 information.\");\n            file.close();\n        } catch (Exception e) {\n            System.out.println(\"Error - \" + e.toString());\n        }\n    }\n}\n\n\n<\/pre>\n<h2>Conclusion<\/h2>\n<p>The Jsoup library offers a virtually unlimited number of applications for page scraping and resource fetching via website hyperlinks. If you&#8217;ve come up with your own creative uses for it, by all means share. It might just get featured in an upcoming article!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based on a unique attribute value in order to download a linked MP3. 
In today&#8217;s conclusion, we&#8217;ll cover how to extract the absolute URL from the first link in the Elements Collection and [&hellip;]<\/p>\n","protected":false},"author":90,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[30624],"tags":[6499,3448],"b2b_audience":[29],"b2b_industry":[52],"b2b_product":[133,107,98],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v18.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Fetch Hyperlinked Files using Jsoup | HTML Goodies<\/title>\n<meta name=\"description\" content=\"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based on\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Fetch Hyperlinked Files using Jsoup | HTML Goodies\" \/>\n<meta property=\"og:description\" content=\"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based on\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/\" \/>\n<meta property=\"og:site_name\" content=\"HTML Goodies\" \/>\n<meta property=\"article:published_time\" content=\"2015-11-07T00:44:00+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@htmlgoodies\" \/>\n<meta name=\"twitter:site\" content=\"@htmlgoodies\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" 
content=\"Rob Gravelle\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.htmlgoodies.com\/#organization\",\"name\":\"HTML Goodies\",\"url\":\"https:\/\/www.htmlgoodies.com\/\",\"sameAs\":[\"https:\/\/twitter.com\/htmlgoodies\"],\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.htmlgoodies.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/03\/HTMLg_weblogo_MobileLogo.png\",\"contentUrl\":\"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/03\/HTMLg_weblogo_MobileLogo.png\",\"width\":584,\"height\":136,\"caption\":\"HTML Goodies\"},\"image\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/#\/schema\/logo\/image\/\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.htmlgoodies.com\/#website\",\"url\":\"https:\/\/www.htmlgoodies.com\/\",\"name\":\"HTML Goodies\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.htmlgoodies.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage\",\"url\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/\",\"name\":\"Fetch Hyperlinked Files using Jsoup | HTML Goodies\",\"isPartOf\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/#website\"},\"datePublished\":\"2015-11-07T00:44:00+00:00\",\"dateModified\":\"2015-11-07T00:44:00+00:00\",\"description\":\"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how 
to select a specific hyperlink element based on\",\"breadcrumb\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.htmlgoodies.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Fetch Hyperlinked Files using Jsoup\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage\"},\"author\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/d340101131281902e682ad0190b7ac75\"},\"headline\":\"Fetch Hyperlinked Files using Jsoup\",\"datePublished\":\"2015-11-07T00:44:00+00:00\",\"dateModified\":\"2015-11-07T00:44:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage\"},\"wordCount\":559,\"publisher\":{\"@id\":\"https:\/\/www.htmlgoodies.com\/#organization\"},\"keywords\":[\"download\",\"HTML\"],\"articleSection\":[\"Java\"],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/d340101131281902e682ad0190b7ac75\",\"name\":\"Rob Gravelle\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/05\/rob-gravelle-150x150.jpg\",\"contentUrl\":\"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/05\/rob-gravelle-150x150.jpg\",\"caption\":\"Rob Gravelle\"},\"description\":\"Rob 
Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.\",\"url\":\"https:\/\/www.htmlgoodies.com\/author\/rob-gravelle\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Fetch Hyperlinked Files using Jsoup | HTML Goodies","description":"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based on","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/","og_locale":"en_US","og_type":"article","og_title":"Fetch Hyperlinked Files using Jsoup | HTML Goodies","og_description":"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based on","og_url":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/","og_site_name":"HTML Goodies","article_published_time":"2015-11-07T00:44:00+00:00","twitter_card":"summary_large_image","twitter_creator":"@htmlgoodies","twitter_site":"@htmlgoodies","twitter_misc":{"Written by":"Rob Gravelle","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/www.htmlgoodies.com\/#organization","name":"HTML Goodies","url":"https:\/\/www.htmlgoodies.com\/","sameAs":["https:\/\/twitter.com\/htmlgoodies"],"logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.htmlgoodies.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/03\/HTMLg_weblogo_MobileLogo.png","contentUrl":"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/03\/HTMLg_weblogo_MobileLogo.png","width":584,"height":136,"caption":"HTML Goodies"},"image":{"@id":"https:\/\/www.htmlgoodies.com\/#\/schema\/logo\/image\/"}},{"@type":"WebSite","@id":"https:\/\/www.htmlgoodies.com\/#website","url":"https:\/\/www.htmlgoodies.com\/","name":"HTML Goodies","description":"","publisher":{"@id":"https:\/\/www.htmlgoodies.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.htmlgoodies.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage","url":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/","name":"Fetch Hyperlinked Files using Jsoup | HTML Goodies","isPartOf":{"@id":"https:\/\/www.htmlgoodies.com\/#website"},"datePublished":"2015-11-07T00:44:00+00:00","dateModified":"2015-11-07T00:44:00+00:00","description":"Fetch Hyperlinked Files using Jsoup In the Download Linked Resources using Jsoup tutorial, we learned how to select a specific hyperlink element based 
on","breadcrumb":{"@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.htmlgoodies.com\/"},{"@type":"ListItem","position":2,"name":"Fetch Hyperlinked Files using Jsoup"}]},{"@type":"Article","@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#article","isPartOf":{"@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage"},"author":{"@id":"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/d340101131281902e682ad0190b7ac75"},"headline":"Fetch Hyperlinked Files using Jsoup","datePublished":"2015-11-07T00:44:00+00:00","dateModified":"2015-11-07T00:44:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.htmlgoodies.com\/java\/fetch-hyperlinked-files-using-jsoup\/#webpage"},"wordCount":559,"publisher":{"@id":"https:\/\/www.htmlgoodies.com\/#organization"},"keywords":["download","HTML"],"articleSection":["Java"],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/d340101131281902e682ad0190b7ac75","name":"Rob Gravelle","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.htmlgoodies.com\/#\/schema\/person\/image\/","url":"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/05\/rob-gravelle-150x150.jpg","contentUrl":"https:\/\/www.htmlgoodies.com\/wp-content\/uploads\/2021\/05\/rob-gravelle-150x150.jpg","caption":"Rob Gravelle"},"description":"Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. 
In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.","url":"https:\/\/www.htmlgoodies.com\/author\/rob-gravelle\/"}]}},"_links":{"self":[{"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/posts\/2666"}],"collection":[{"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/users\/90"}],"replies":[{"embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/comments?post=2666"}],"version-history":[{"count":0,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/posts\/2666\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/media?parent=2666"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/categories?post=2666"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/tags?post=2666"},{"taxonomy":"b2b_audience","embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/b2b_audience?post=2666"},{"taxonomy":"b2b_industry","embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/b2b_industry?post=2666"},{"taxonomy":"b2b_product","embeddable":true,"href":"https:\/\/www.htmlgoodies.com\/wp-json\/wp\/v2\/b2b_product?post=2666"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}