Scraping from dropdown option value Python BeautifulSoup
I am trying to scrape data from a web page that has a dropdown input, using BeautifulSoup.
This is the dropdown:
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
This is what I tried:
soup = BeautifulSoup(url, 'html.parser')
soup['selected'] = 'G1'
data = soup.findAll("table", {"style": "font-size:14px"})
print(data)
Each time the dropdown is submitted, the page shows its data in a <table> tag, but my code only gets the <table> for the main page. How do I get the data for each dropdown option?
python python-3.x web-scraping beautifulsoup
edited Nov 24 '18 at 15:07
Ilham Riski
asked Nov 24 '18 at 14:28
Ilham Riski
2 Answers
You can keep using find() and findAll() (also available as find_all()) to do this:
from bs4 import BeautifulSoup
html = """
<table style="font-size:14px">
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
</table>
"""
soup = BeautifulSoup(html, "lxml")
# Reach the options through the dropdown itself...
option = soup.find("selected", {"name": "try"}).findAll("option")
# ...or through the enclosing table
option_ = soup.find("table", {"style": "font-size:14px"}).findAll("option")
print(option)
print(option_)
# [<option value="G1">1</option>, <option value="G2">2</option>]
# [<option value="G1">1</option>, <option value="G2">2</option>]
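As a follow-up sketch (an addition, not part of the original answer), the matched option tags can be turned into a dictionary of value to visible text. It uses the built-in html.parser so no extra parser is needed, and the helper name options_to_dict is hypothetical.

```python
from bs4 import BeautifulSoup

html = """
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
"""


def options_to_dict(markup, tag_name, name_attr):
    """Map each option's value attribute to its visible text."""
    soup = BeautifulSoup(markup, "html.parser")
    # Locate the dropdown by tag name and its name attribute
    dropdown = soup.find(tag_name, {"name": name_attr})
    if dropdown is None:
        return {}
    return {
        option["value"]: option.get_text(strip=True)
        for option in dropdown.find_all("option")
        if option.has_attr("value")  # skip options without a value
    }


print(options_to_dict(html, "selected", "try"))
```

Returning an empty dictionary when the dropdown is missing avoids the AttributeError that chaining .findAll() onto a None result would raise.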
answered Nov 25 '18 at 3:20
kcorlidy
Try an attribute CSS selector:
soup.select('option[value]')
The [value] part is an attribute selector. This looks for option tag elements that have a value attribute. If there is a parent class or id that could be used, that would help in case there is more than one dropdown on the page.
items = soup.select('option[value]')
values = [item.get('value') for item in items]
textValues = [item.text for item in items]
You can use the parent's name attribute to limit the match to one dropdown (hopefully; you need to test and see whether something further is required to limit it sufficiently). It is used with a descendant combinator:
items = soup.select('[name=try] option[value]')
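To illustrate why scoping by the parent matters, here is a small sketch with two dropdowns on one page. The second dropdown (name="other") is made up for the example, and the markup uses the standard select tag on the assumption that "selected" in the question is a typo. It also contrasts option[value="G1"], which matches on the attribute's value, with option[G1], which looks for an attribute that is itself named G1.

```python
from bs4 import BeautifulSoup

# Hypothetical page with two dropdowns; only the first comes from the question
html = """
<select name="try">
  <option value="G1">1</option>
  <option value="G2">2</option>
</select>
<select name="other">
  <option value="X1">A</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# Every option on the page that has a value attribute
all_values = [o["value"] for o in soup.select("option[value]")]

# Only the options inside the element whose name attribute is "try"
try_values = [o["value"] for o in soup.select('[name="try"] option[value]')]

# Match on the attribute's value rather than its mere presence
g1_only = soup.select('option[value="G1"]')

# Looks for an attribute called G1, which no option here has
named_g1 = soup.select("option[G1]")

print(all_values, try_values, g1_only, named_g1)
```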
edited Nov 24 '18 at 14:45
answered Nov 24 '18 at 14:29
QHarr
soup.select('option[G1]') like this?
– Ilham Riski
Nov 24 '18 at 14:33
Is there a parent id/class for the drop down?
– QHarr
Nov 24 '18 at 14:33
<selected name="try"> @QHarr
– Ilham Riski
Nov 24 '18 at 14:35
I mean, can I get the table for G1 and G2 based on user input? I already have the dropdown list: <selected name=try> <option value="G1">1</option> <option value="G2">2</option>
– Ilham Riski
Nov 24 '18 at 14:40
The above will extract the text in those drop downs for each option as well as the values of the value attribute.
– QHarr
Nov 24 '18 at 14:42
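The comments suggest the goal is the table shown after each option is submitted. BeautifulSoup only parses the HTML it is given and does not submit forms, so assigning soup['selected'] = 'G1' changes only the parsed object in memory. One possible approach, sketched below under assumptions, is to request the page once per option value and parse each response. It assumes the form sends the field "try" to the same URL via POST and that the server renders the table in the response rather than through JavaScript; the form's real action and method need checking in the page source. The names option_values, tables_for_each_option, fetch and fetch_with_requests are hypothetical.

```python
from bs4 import BeautifulSoup


def option_values(html, field):
    """Return the value attribute of every option under the named dropdown."""
    soup = BeautifulSoup(html, "html.parser")
    return [o["value"] for o in soup.select(f'[name="{field}"] option[value]')]


def tables_for_each_option(fetch, field):
    """Collect the matching tables for each dropdown option.

    fetch(None) must return the main page HTML; fetch({field: value})
    must return the HTML produced by submitting that option.
    """
    results = {}
    for value in option_values(fetch(None), field):
        page = BeautifulSoup(fetch({field: value}), "html.parser")
        results[value] = page.find_all("table", {"style": "font-size:14px"})
    return results


def fetch_with_requests(url):
    """Build a fetch function backed by requests (assumes a POST form)."""
    import requests  # imported here so the parsing helpers work without it

    def fetch(data):
        if data is None:
            response = requests.get(url)  # plain load of the main page
        else:
            response = requests.post(url, data=data)  # simulated submit
        response.raise_for_status()
        return response.text

    return fetch
```

With a real page this would be called as tables_for_each_option(fetch_with_requests(page_url), "try"), where page_url is whatever address the form lives at. Passing fetch in as a parameter keeps the parsing logic testable without network access.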