How to get specific lines with Jsoup
Below is the page source that I am trying to scrape with Jsoup. I am interested in the following fields: "Code Number", "Date Available", "Type", "Breed", "Sex", "Age", "Weight" and "Adoption Fee". That is, I would like my output to be:
Code Number: 107796
Date Available: 11/20/2018
Type: Dog
Breed: German Shepherd Dog
Sex: Male
Age: 2 years, 0 months
Weight: 64.6 lbs
Adoption Fee: $250
Source code from:
view-source:https://southwesthumane.org/adopt/dogs/dog-details/?id=84807
lines 186-215
<div id="ContentPlaceHolder_Item3_AnimalDetails_2_divDetails">
<h3>Alan</h3>
<div class="float-to-right animal-slideshow">
<div class="cycle-slideshow" data-cycle-fx="Fade" data-cycle-timeout="0" data-cycle-auto-height="container" data-cycle-pager="#adv-custom-pager" data-cycle-pager-template="<a href='#'><img src='{{src}}' width=50 height=50></a>">
<img src="http://southwesthumanepets.shelterbuddy.com/photos/lostfound/84807.jpg" />
</div>
<div id="adv-custom-pager"></div>
</div>
<div class="AnimalDetails">
<p>Alan is looking for a new best friend! Could it be you? Alan is new to the shelter and we are still getting to know his unique personality. If Alan looks like your dream dog, let the staff know you are interested in meeting him. Going to a new home can be exciting and strange for pets, so it's best for them to meet any children and other dogs in their future home. Alan can't wait to meet his forever family!</p>
<br />
<strong>Code Number: </strong>107796
<br />
<strong>Date Available: </strong>11/20/2018
<br />
<strong>Type: </strong>Dog
<br />
<strong>Breed: </strong>German Shepherd Dog
<br />
<strong>Sex: </strong>Male
<br />
<strong>Age: </strong>2 years, 0 months
<br />
<strong>Weight: </strong>64.6 lbs
<br />
<strong>Adoption Fee: </strong>$250
<br />
<br />
</div>
</div>
Here is my code so far:
try {
    Document dogs = Jsoup.connect("https://southwesthumane.org/adopt/dogs/").get();
    Elements links_dogs = dogs.select(":containsOwn(Details »)");
    for (Element link : links_dogs) {
        String test = "https://southwesthumane.org" + link.attr("href");
        System.out.println("url: " + test);
        try {
            Document dog = Jsoup.connect(test).get();
            Elements name = dog.select("h3");
            Elements description = dog.select("div.Animaldetails");
            for (Element code : name) {
                System.out.println("Name: " + code.text());
            }
            for (Element code : description) {
                System.out.println("Description: " + code.select("p").text());
                System.out.println(code.select("strong").first().text());
                System.out.println(code.select("div.Animaldetails").text());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}
This line:
System.out.println(code.select("div.Animaldetails").text());
prints all the information I need, but I do not know how to split it into individual fields; ultimately I want to save each value into a list. Any help would be greatly appreciated. Thank you for your time!
jsoup
asked Nov 20 at 23:47
Samuel Serea
2 Answers
I checked @Eritrean's answer, but I think mine gets you exactly what you are looking for in a clearer way. Here is some sample code that does it with Jsoup:
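import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) {
        try {
            String url = "https://southwesthumane.org/adopt/dogs/dog-details/?id=84807";
            // Send a browser-like user agent with the request
            Document document = Jsoup.connect(url).userAgent("Mozilla/5.0").get();
            // Every <strong> label that is a direct child of div.AnimalDetails
            Elements elements = document.select("div.AnimalDetails > strong");
            for (Element element : elements) {
                // The value is the node that comes right after the label
                System.out.println(element.text() + element.nextSibling().toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}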
As you can see, once you establish the connection to your desired URL, you just need to select all the strong tags that are direct children of the div with class name AnimalDetails.
That gives you an Elements object from Jsoup, which you loop over with a for-each loop; each iteration hands you one strong element.
Next, retrieve the text inside each of those tags using Jsoup's .text() method. Given how the HTML is structured, the value you are looking for is the node that comes right after the tag.
As the HTML structure of the AnimalDetails div is like this:
<br>
<strong>Code Number: </strong>107796
<br>
<strong>Date Available: </strong>11/20/2018
<br>
...
and so on
To get that value, call the .nextSibling() method on the strong element; it returns the node that follows the tag (here, a text node), which you then convert into a String using .toString(). Then you just need to print the label and the value together, as in the for-each loop above.
This should print one label and its value per line, in the order you listed in your question.
Hope this helped you! For further information feel free to ask me!
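Since you mentioned wanting to store the values rather than just print them, here is a minimal sketch of the same nextSibling idea that collects the pairs into a map. The class name AnimalDetailsParser and the method parseDetails are hypothetical names chosen for illustration, and the inline HTML is a shortened copy of the markup from the question so the example runs without a network call. It assumes each value is a plain text node directly after its strong tag.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class AnimalDetailsParser {

    // Hypothetical helper: maps each <strong> label to the text node that follows it.
    static Map<String, String> parseDetails(Document document) {
        Map<String, String> details = new LinkedHashMap<>(); // keeps the page order
        for (Element label : document.select("div.AnimalDetails > strong")) {
            Node next = label.nextSibling();
            if (!(next instanceof TextNode)) {
                continue; // no plain-text value directly after this label
            }
            // Drop the trailing colon from the label and trim the value.
            String key = label.text().replaceAll(":\\s*$", "").trim();
            String value = ((TextNode) next).text().trim();
            details.put(key, value);
        }
        return details;
    }

    public static void main(String[] args) {
        // Shortened inline copy of the markup from the question.
        String html = "<div class=\"AnimalDetails\">"
                + "<p>Alan is looking for a new best friend!</p><br />"
                + "<strong>Code Number: </strong>107796<br />"
                + "<strong>Breed: </strong>German Shepherd Dog<br />"
                + "<strong>Adoption Fee: </strong>$250<br /></div>";
        Map<String, String> details = parseDetails(Jsoup.parse(html));
        details.forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

With a live page you would pass the Document returned by Jsoup.connect(url).get() instead of the parsed string. If you need a list rather than a map, new ArrayList<>(details.values()) gives the values in page order.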
answered Nov 21 at 12:38
alvarobartt
You can select the strong HTML tags, and for each tag retrieved, get the nextSibling. Try it out by changing your for each loop:
for (Element code : description) {
    System.out.println("Description: " + code.select("p").text());
    System.out.println(code.select("strong").first().text());
    System.out.println(code.select("div.AnimalDetails").text());
}
to:
for (Element code : description) {
    Elements strongs = code.select("strong");
    for (Element e : strongs) {
        System.out.println(e.text() + e.nextSibling().toString());
    }
    System.out.println();
}
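If you would rather keep working from the single flattened string that text() returns, an alternative (not the nextSibling approach above) is to split that string on the known labels. The sketch below uses only the standard library; the class name FlatDetailsSplitter and the method split are hypothetical, and the label list is an assumption taken from the fields named in the question. It is more fragile than walking the nodes, because a description that happens to contain something like "Type:" would be split as well.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlatDetailsSplitter {

    // Assumption: these are the only labels of interest, as listed in the question.
    static final List<String> LABELS = Arrays.asList(
            "Code Number", "Date Available", "Type", "Breed",
            "Sex", "Age", "Weight", "Adoption Fee");

    // Hypothetical helper: cuts the flattened text at each "Label:" occurrence.
    static Map<String, String> split(String flatText) {
        StringBuilder alternation = new StringBuilder();
        for (String label : LABELS) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(label));
        }
        Matcher matcher = Pattern.compile("\\b(" + alternation + "):\\s*").matcher(flatText);

        Map<String, String> details = new LinkedHashMap<>();
        String currentLabel = null;
        int valueStart = 0;
        while (matcher.find()) {
            if (currentLabel != null) {
                // The previous value runs up to the start of this label.
                details.put(currentLabel, flatText.substring(valueStart, matcher.start()).trim());
            }
            currentLabel = matcher.group(1);
            valueStart = matcher.end();
        }
        if (currentLabel != null) {
            // The last value runs to the end of the string.
            details.put(currentLabel, flatText.substring(valueStart).trim());
        }
        return details;
    }

    public static void main(String[] args) {
        String flat = "Alan can't wait to meet his forever family! Code Number: 107796 "
                + "Date Available: 11/20/2018 Type: Dog Breed: German Shepherd Dog Sex: Male "
                + "Age: 2 years, 0 months Weight: 64.6 lbs Adoption Fee: $250";
        split(flat).forEach((label, value) -> System.out.println(label + ": " + value));
    }
}
```

Any text before the first label (the description paragraph) is ignored by this sketch.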
edited Nov 21 at 12:44
alvarobartt
answered Nov 21 at 11:16
Eritrean