History of YouTube channel's subscriber count

I'm trying to collect a channel's subscriber count over time so I can fit some graphs to it. The program is crude: it pulls the HTML from https://socialblade.com/youtube/user/pewdiepie/realtime and finds the line that holds the live subscriber count. For some reason the HTML I get back only changes about once an hour, so I don't get data as frequently as I'd like (does this have something to do with caching?). I don't know much about how networking works in Java. I was just trying to put something together and wanted an easy way to get the data so that I can apply some machine learning or LoggerPro curve fitting to it. I couldn't find a fix by searching on Google, since I'm not sure what the problem even is. Also, could it be considered a DoS attack if I automatically connect to their site every 10 seconds or so?

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;

public class Main {

    // Element that holds the raw subscriber count in the page source.
    private static final String MARKER = "<p id=\"rawCount\" style=\"display: none;\">";

    public static void main(String[] args) throws Exception {
        // Send a browser-like User-Agent header with every request.
        System.setProperty("http.agent", "Chrome");
        URL url = new URL("https://socialblade.com/youtube/user/pewdiepie/realtime");

        long lastTime = System.currentTimeMillis();
        String lastInputLine = "";

        try (PrintWriter out = new PrintWriter("pewDieSubs" + System.currentTimeMillis() + ".txt")) {
            while (true) {
                // Download the page again on every pass and look for the count line.
                try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
                    String inputLine;
                    while ((inputLine = in.readLine()) != null) {
                        int start = inputLine.indexOf(MARKER);
                        if (start < 0) {
                            continue;
                        }
                        // Only record the value when the line has changed since the last poll.
                        if (!inputLine.equals(lastInputLine)) {
                            lastInputLine = inputLine;
                            long now = System.currentTimeMillis();
                            long deltaTime = now - lastTime;
                            lastTime = now;

                            System.out.println(inputLine);

                            // Read the digits that follow the marker instead of a fixed offset.
                            int from = start + MARKER.length();
                            int to = from;
                            while (to < inputLine.length() && Character.isDigit(inputLine.charAt(to))) {
                                to++;
                            }
                            String count = inputLine.substring(from, to);

                            System.out.println(count + " ---  deltaTime = " + deltaTime);
                            out.println(lastTime + " " + count);
                            out.flush();
                        }
                        break;
                    }
                }
                Thread.sleep(10000); // wait 10 seconds between requests
            }
        }
    }
}
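The question's code pulls the number out of the line by character position, which breaks if the line is indented or the count gains a digit. As a rough sketch of an alternative, the count could be captured with a regular expression. The class name `RawCountParser` and the sample HTML line are made up for illustration; only the `rawCount` element itself comes from the question, and the real page may be laid out differently.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RawCountParser {

    // Matches the hidden element from the question and captures the digits inside it.
    private static final Pattern RAW_COUNT =
            Pattern.compile("<p id=\"rawCount\"[^>]*>\\s*(\\d+)");

    /** Returns the count found in the given HTML, or empty if the element is absent. */
    public static OptionalLong parse(String html) {
        Matcher m = RAW_COUNT.matcher(html);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Made-up sample line (an assumption); the real page may differ.
        String sample = "  <p id=\"rawCount\" style=\"display: none;\">72345678</p>";
        System.out.println(parse(sample));
        System.out.println(parse("<p>no count here</p>"));
    }
}
```

Returning an `OptionalLong` makes the "element not found" case explicit, so the caller can skip that poll instead of writing a garbage value to the file.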

edited Nov 24 '18 at 13:33 by Robin Green

asked Nov 24 '18 at 13:30 by Auruttch

There are several problems with your approach. First of all, the way you retrieve a channel's views is tedious and error-prone. YouTube offers an API that can provide this information, which saves you from scraping and parsing raw HTML. If you need to check periodically, you should create a separate thread that runs on a schedule. At the moment you run everything in the main thread, which you then instruct to sleep for 10 seconds. Your process is not repeatable and will only run once. I suggest you take a look at both the API and Java threads.

– Aris_Kortex
Nov 24 '18 at 14:07
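The periodic background check suggested in this comment could look roughly like the following, using the JDK's `ScheduledExecutorService`. This is a minimal sketch, not the commenter's code: the class name `PeriodicPoller`, the `poll` helper, and the placeholder task are all hypothetical, and a real task would fetch and record the subscriber count.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicPoller {

    /** Runs the task right away and then every periodMillis on a background thread. */
    public static ScheduledExecutorService poll(Runnable task, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log it and carry on.
                System.err.println("poll failed: " + e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        // Placeholder task (an assumption): a real one would fetch and store the count.
        ScheduledExecutorService scheduler = poll(
                () -> System.out.println("poll #" + runs.incrementAndGet()), 100);
        Thread.sleep(350);       // let a few polls happen
        scheduler.shutdownNow(); // stop the background thread so the JVM can exit
    }
}
```

For the 10-second interval from the question, the call would be `poll(task, 10_000)`. Compared with `Thread.sleep` in a loop, the scheduler keeps the interval steady even when a single fetch is slow, and the main thread stays free.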

@Aris_Kortex Believe it or not, I have run the code and it does not run only once; it repeatedly fetches the HTML. As I said, the problem is that the HTML it gets back only changes about once an hour. It does change and I do get new data, just not as frequently as I'd like. You are right that I should probably use the YouTube API. I just thought that if I could get this working with a minor fix, I could move on to what I actually wanted to do with the data. It would also be interesting to get this to work so that I could use it in the future for an application where there is no API.

– Auruttch
Nov 24 '18 at 14:53

Got it working using the YouTube API. Thank you @Aris_Kortex

– Auruttch
Nov 25 '18 at 13:26

I can very well turn this into a proper answer so that you can upvote it, if you like.

– Aris_Kortex
Nov 28 '18 at 17:02

@Aris_Kortex Well, I actually used the API in kind of a cheap way and just scraped the response from googleapis.com/youtube/v3/…. I didn't change the code much, but the data I get from this site is updated every time I request it. I really just wanted to move on to analyzing the data, and this works well enough. I still don't know why Social Blade doesn't give me fresh data, so I don't know whether I've solved the question I asked in the first place. You can still post a proper answer if you think it could be helpful to others.

– Auruttch
Dec 6 '18 at 20:22
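The thread never establishes why the Social Blade page came back stale. One thing that can be ruled out on the client side is caching, by asking for a fresh copy explicitly. The sketch below is only a guess at a diagnostic step, not a confirmed fix: if the server or a CDN in front of it regenerates the page only occasionally, no request header will change that. The class name `FreshRequest` and the `open` helper are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

public class FreshRequest {

    /** Prepares (but does not yet send) a request that asks for an uncached copy. */
    public static URLConnection open(String address) {
        try {
            URLConnection conn = new URL(address).openConnection();
            conn.setUseCaches(false);                              // skip any client-side cache
            conn.setRequestProperty("Cache-Control", "no-cache"); // ask intermediaries to revalidate
            conn.setRequestProperty("Pragma", "no-cache");        // older HTTP/1.0 equivalent
            conn.setRequestProperty("User-Agent", "Chrome");      // same agent string as the question
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("https://socialblade.com/youtube/user/pewdiepie/realtime");
        // Nothing is sent until getInputStream() is called; this only shows the configuration.
        System.out.println("useCaches=" + conn.getUseCaches()
                + ", Cache-Control=" + conn.getRequestProperty("Cache-Control"));
    }
}
```

To use it in the question's loop, `url.openStream()` would be replaced by `FreshRequest.open(address).getInputStream()`. If the count still changes only about once an hour, the staleness is most likely on the server side.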

java web-scraping
