How to find first set?











up vote
0
down vote

favorite












I am trying to list the First set of a given grammar with this function:



Note:
char c - the character to find the first set;
first_set - store elements of the corresponding first set;
q1, q2 - the previous position;
rule- store all the grammar rule line by line listed below;

for the first time the parameters are ('S', 0, 0).



void findfirst(char c, int q1, int q2){
if(!(isupper(c)) || c=='$'){
first_set[n++] = c;
}
for(int j=0;j<rule_number;j++){
if(rule[j][0]==c){
if(rule[j][2]==';'){
if(rule[q1][q2]=='')
first_set[n++] = ';';
else if(rule[q1][q2]!='' &&(q1!=0||q2!=0))
findfirst(rule[q1][q2], q1, (q2+1));
else
first_set[n++] = ';';
}
else if(!isupper(rule[j][2]) || rule[j][2]=='$')
first_set[n++] = rule[j][2];
else
findfirst(rule[j][2],j,3);
}
}
}


But found that if the given grammar looks like this:



S AC$
C c
C ;
A aBCd
A BQ
B bB
B ;
Q q
Q ;


(which the left hand side or any capital letters in the right hand side are non-terminal, and any small case letters are terminal)
the function couldn't correctly output the first set for S, since it will stop at finding the first set of Q and store ';' to the first set and won't go on to find C's first set.



Does anyone have a clue? Thanks in advance.










share|improve this question




















  • 3




    Get in the habit of documenting your code and using descriptive variable names.
    – paddy
    Nov 19 at 11:18















up vote
0
down vote

favorite












I am trying to list the First set of a given grammar with this function:



Note:
char c - the character to find the first set;
first_set - store elements of the corresponding first set;
q1, q2 - the previous position;
rule- store all the grammar rule line by line listed below;

for the first time the parameters are ('S', 0, 0).



void findfirst(char c, int q1, int q2){
if(!(isupper(c)) || c=='$'){
first_set[n++] = c;
}
for(int j=0;j<rule_number;j++){
if(rule[j][0]==c){
if(rule[j][2]==';'){
if(rule[q1][q2]=='')
first_set[n++] = ';';
else if(rule[q1][q2]!='' &&(q1!=0||q2!=0))
findfirst(rule[q1][q2], q1, (q2+1));
else
first_set[n++] = ';';
}
else if(!isupper(rule[j][2]) || rule[j][2]=='$')
first_set[n++] = rule[j][2];
else
findfirst(rule[j][2],j,3);
}
}
}


But found that if the given grammar looks like this:



S AC$
C c
C ;
A aBCd
A BQ
B bB
B ;
Q q
Q ;


(which the left hand side or any capital letters in the right hand side are non-terminal, and any small case letters are terminal)
the function couldn't correctly output the first set for S, since it will stop at finding the first set of Q and store ';' to the first set and won't go on to find C's first set.



Does anyone have a clue? Thanks in advance.










share|improve this question




















  • 3




    Get in the habit of documenting your code and using descriptive variable names.
    – paddy
    Nov 19 at 11:18













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I am trying to list the First set of a given grammar with this function:



Note:
char c - the character to find the first set;
first_set - store elements of the corresponding first set;
q1, q2 - the previous position;
rule- store all the grammar rule line by line listed below;

for the first time the parameters are ('S', 0, 0).



void findfirst(char c, int q1, int q2){
if(!(isupper(c)) || c=='$'){
first_set[n++] = c;
}
for(int j=0;j<rule_number;j++){
if(rule[j][0]==c){
if(rule[j][2]==';'){
if(rule[q1][q2]=='')
first_set[n++] = ';';
else if(rule[q1][q2]!='' &&(q1!=0||q2!=0))
findfirst(rule[q1][q2], q1, (q2+1));
else
first_set[n++] = ';';
}
else if(!isupper(rule[j][2]) || rule[j][2]=='$')
first_set[n++] = rule[j][2];
else
findfirst(rule[j][2],j,3);
}
}
}


But found that if the given grammar looks like this:



S AC$
C c
C ;
A aBCd
A BQ
B bB
B ;
Q q
Q ;


(which the left hand side or any capital letters in the right hand side are non-terminal, and any small case letters are terminal)
the function couldn't correctly output the first set for S, since it will stop at finding the first set of Q and store ';' to the first set and won't go on to find C's first set.



Does anyone have a clue? Thanks in advance.










share|improve this question















I am trying to list the First set of a given grammar with this function:



Note:
char c - the character to find the first set;
first_set - store elements of the corresponding first set;
q1, q2 - the previous position;
rule- store all the grammar rule line by line listed below;

for the first time the parameters are ('S', 0, 0).



void findfirst(char c, int q1, int q2){
if(!(isupper(c)) || c=='$'){
first_set[n++] = c;
}
for(int j=0;j<rule_number;j++){
if(rule[j][0]==c){
if(rule[j][2]==';'){
if(rule[q1][q2]=='')
first_set[n++] = ';';
else if(rule[q1][q2]!='' &&(q1!=0||q2!=0))
findfirst(rule[q1][q2], q1, (q2+1));
else
first_set[n++] = ';';
}
else if(!isupper(rule[j][2]) || rule[j][2]=='$')
first_set[n++] = rule[j][2];
else
findfirst(rule[j][2],j,3);
}
}
}


But found that if the given grammar looks like this:



S AC$
C c
C ;
A aBCd
A BQ
B bB
B ;
Q q
Q ;


(which the left hand side or any capital letters in the right hand side are non-terminal, and any small case letters are terminal)
the function couldn't correctly output the first set for S, since it will stop at finding the first set of Q and store ';' to the first set and won't go on to find C's first set.



Does anyone have a clue? Thanks in advance.







c++ compiler-construction






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 19 at 10:53









Rhathin

5411313




5411313










asked Nov 19 at 10:40









Lykas

377




377








  • 3




    Get in the habit of documenting your code and using descriptive variable names.
    – paddy
    Nov 19 at 11:18














  • 3




    Get in the habit of documenting your code and using descriptive variable names.
    – paddy
    Nov 19 at 11:18








3




3




Get in the habit of documenting your code and using descriptive variable names.
– paddy
Nov 19 at 11:18




Get in the habit of documenting your code and using descriptive variable names.
– paddy
Nov 19 at 11:18












2 Answers
2






active

oldest

votes

















up vote
1
down vote



accepted










It is extremely inefficient to compute FIRST sets one at a time, since they are interdependent. For example, in order to compute the FIRST set of A , you need to also compute the FIRST set of B, and then because B can derive the emoty string, you need the FIRST set of Q.



Most algorithms compute all of them in parallel, using some variation of a transitive closure algorithm. You can do this with a depth-first search, which seems to be what you are attempting, but it might be easier to implement the least fixed point algorithm described in the Dragon book (and Wikipedia.



Either way, you will probably find it easier to first compute NULLABLE (that is, which non-terminals derive the empty set). There is a simple linear-time algorithm for that (linear in the size of the grammar), which again is easy to find.



If you are doing this work as part of a class, you'll probably find the relevant algorithms in your course materials. Alternatively, you can look for a copy of the Dragon book or other similar text books.






share|improve this answer























  • Thanks a lot. I didn't even think of efficiency :(
    – Lykas
    2 days ago


















up vote
0
down vote













You could do like the following code:



used[i] means the rule[i] is used or not



The method is Depth-first search, see https://en.wikipedia.org/wiki/Depth-first_search



#include <iostream>

#define MAX_SIZE 1024

char rule[10] = {
"S AC$",
"C c",
"C ;",
"A aBCd",
"A BQ",
"B bB",
"B ;",
"Q q",
"Q ;"
};

constexpr int rule_number = sizeof(rule) / sizeof(rule[0]);

char first_set[MAX_SIZE];

bool findfirst(int row, int col, int *n, bool* used) {
for (;;) {
char ch = rule[row][col];
if (ch == '$' || ch == ';' || ch == '') {
first_set[*n] = '';
break;
}
if (islower(ch)) {
first_set[(*n)++] = ch;
++col;
continue;
}
int i;
for (i = 0; i != rule_number; ++i) {
if (used[i] == true || rule[i][0] != ch)
continue;
used[i] = true;
int k = *n;
if (findfirst(i, 2, n, used) == true)
break;
used[i] = false;
*n = k;
}
if (i == rule_number)
return false;
++col;
}
return true;
}

int main() {
bool used[rule_number];
int n = 0;
for (int i = 2; rule[0][i] != '$' && rule[0][i] != ''; ++i) {
for (int j = 0; j != rule_number; ++j)
used[j] = false;
used[0] = true;
findfirst(0, i, &n, used);
}
std::cout << first_set << std::endl;
return 0;
}





share|improve this answer





















    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














     

    draft saved


    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53372838%2fhow-to-find-first-set%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    It is extremely inefficient to compute FIRST sets one at a time, since they are interdependent. For example, in order to compute the FIRST set of A , you need to also compute the FIRST set of B, and then because B can derive the emoty string, you need the FIRST set of Q.



    Most algorithms compute all of them in parallel, using some variation of a transitive closure algorithm. You can do this with a depth-first search, which seems to be what you are attempting, but it might be easier to implement the least fixed point algorithm described in the Dragon book (and Wikipedia.



    Either way, you will probably find it easier to first compute NULLABLE (that is, which non-terminals derive the empty set). There is a simple linear-time algorithm for that (linear in the size of the grammar), which again is easy to find.



    If you are doing this work as part of a class, you'll probably find the relevant algorithms in your course materials. Alternatively, you can look for a copy of the Dragon book or other similar text books.






    share|improve this answer























    • Thanks a lot. I didn't even think of efficiency :(
      – Lykas
      2 days ago















    up vote
    1
    down vote



    accepted










    It is extremely inefficient to compute FIRST sets one at a time, since they are interdependent. For example, in order to compute the FIRST set of A , you need to also compute the FIRST set of B, and then because B can derive the emoty string, you need the FIRST set of Q.



    Most algorithms compute all of them in parallel, using some variation of a transitive closure algorithm. You can do this with a depth-first search, which seems to be what you are attempting, but it might be easier to implement the least fixed point algorithm described in the Dragon book (and Wikipedia.



    Either way, you will probably find it easier to first compute NULLABLE (that is, which non-terminals derive the empty set). There is a simple linear-time algorithm for that (linear in the size of the grammar), which again is easy to find.



    If you are doing this work as part of a class, you'll probably find the relevant algorithms in your course materials. Alternatively, you can look for a copy of the Dragon book or other similar text books.






    share|improve this answer























    • Thanks a lot. I didn't even think of efficiency :(
      – Lykas
      2 days ago













    up vote
    1
    down vote



    accepted







    up vote
    1
    down vote



    accepted






    It is extremely inefficient to compute FIRST sets one at a time, since they are interdependent. For example, in order to compute the FIRST set of A , you need to also compute the FIRST set of B, and then because B can derive the emoty string, you need the FIRST set of Q.



    Most algorithms compute all of them in parallel, using some variation of a transitive closure algorithm. You can do this with a depth-first search, which seems to be what you are attempting, but it might be easier to implement the least fixed point algorithm described in the Dragon book (and Wikipedia.



    Either way, you will probably find it easier to first compute NULLABLE (that is, which non-terminals derive the empty set). There is a simple linear-time algorithm for that (linear in the size of the grammar), which again is easy to find.



    If you are doing this work as part of a class, you'll probably find the relevant algorithms in your course materials. Alternatively, you can look for a copy of the Dragon book or other similar text books.






    share|improve this answer














    It is extremely inefficient to compute FIRST sets one at a time, since they are interdependent. For example, in order to compute the FIRST set of A , you need to also compute the FIRST set of B, and then because B can derive the emoty string, you need the FIRST set of Q.



    Most algorithms compute all of them in parallel, using some variation of a transitive closure algorithm. You can do this with a depth-first search, which seems to be what you are attempting, but it might be easier to implement the least fixed point algorithm described in the Dragon book (and Wikipedia.



    Either way, you will probably find it easier to first compute NULLABLE (that is, which non-terminals derive the empty set). There is a simple linear-time algorithm for that (linear in the size of the grammar), which again is easy to find.



    If you are doing this work as part of a class, you'll probably find the relevant algorithms in your course materials. Alternatively, you can look for a copy of the Dragon book or other similar text books.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Nov 19 at 16:06

























    answered Nov 19 at 15:50









    rici

    150k19128192




    150k19128192












    • Thanks a lot. I didn't even think of efficiency :(
      – Lykas
      2 days ago


















    • Thanks a lot. I didn't even think of efficiency :(
      – Lykas
      2 days ago
















    Thanks a lot. I didn't even think of efficiency :(
    – Lykas
    2 days ago




    Thanks a lot. I didn't even think of efficiency :(
    – Lykas
    2 days ago












    up vote
    0
    down vote













    You could do like the following code:



    used[i] means the rule[i] is used or not



    The method is Depth-first search, see https://en.wikipedia.org/wiki/Depth-first_search



    #include <iostream>

    #define MAX_SIZE 1024

    char rule[10] = {
    "S AC$",
    "C c",
    "C ;",
    "A aBCd",
    "A BQ",
    "B bB",
    "B ;",
    "Q q",
    "Q ;"
    };

    constexpr int rule_number = sizeof(rule) / sizeof(rule[0]);

    char first_set[MAX_SIZE];

    bool findfirst(int row, int col, int *n, bool* used) {
    for (;;) {
    char ch = rule[row][col];
    if (ch == '$' || ch == ';' || ch == '') {
    first_set[*n] = '';
    break;
    }
    if (islower(ch)) {
    first_set[(*n)++] = ch;
    ++col;
    continue;
    }
    int i;
    for (i = 0; i != rule_number; ++i) {
    if (used[i] == true || rule[i][0] != ch)
    continue;
    used[i] = true;
    int k = *n;
    if (findfirst(i, 2, n, used) == true)
    break;
    used[i] = false;
    *n = k;
    }
    if (i == rule_number)
    return false;
    ++col;
    }
    return true;
    }

    int main() {
    bool used[rule_number];
    int n = 0;
    for (int i = 2; rule[0][i] != '$' && rule[0][i] != ''; ++i) {
    for (int j = 0; j != rule_number; ++j)
    used[j] = false;
    used[0] = true;
    findfirst(0, i, &n, used);
    }
    std::cout << first_set << std::endl;
    return 0;
    }





    share|improve this answer

























      up vote
      0
      down vote













      You could do like the following code:



      used[i] means the rule[i] is used or not



      The method is Depth-first search, see https://en.wikipedia.org/wiki/Depth-first_search



      #include <iostream>

      #define MAX_SIZE 1024

      char rule[10] = {
      "S AC$",
      "C c",
      "C ;",
      "A aBCd",
      "A BQ",
      "B bB",
      "B ;",
      "Q q",
      "Q ;"
      };

      constexpr int rule_number = sizeof(rule) / sizeof(rule[0]);

      char first_set[MAX_SIZE];

      bool findfirst(int row, int col, int *n, bool* used) {
      for (;;) {
      char ch = rule[row][col];
      if (ch == '$' || ch == ';' || ch == '') {
      first_set[*n] = '';
      break;
      }
      if (islower(ch)) {
      first_set[(*n)++] = ch;
      ++col;
      continue;
      }
      int i;
      for (i = 0; i != rule_number; ++i) {
      if (used[i] == true || rule[i][0] != ch)
      continue;
      used[i] = true;
      int k = *n;
      if (findfirst(i, 2, n, used) == true)
      break;
      used[i] = false;
      *n = k;
      }
      if (i == rule_number)
      return false;
      ++col;
      }
      return true;
      }

      int main() {
      bool used[rule_number];
      int n = 0;
      for (int i = 2; rule[0][i] != '$' && rule[0][i] != ''; ++i) {
      for (int j = 0; j != rule_number; ++j)
      used[j] = false;
      used[0] = true;
      findfirst(0, i, &n, used);
      }
      std::cout << first_set << std::endl;
      return 0;
      }





      share|improve this answer























        up vote
        0
        down vote










        up vote
        0
        down vote









        You could do like the following code:



        used[i] means the rule[i] is used or not



        The method is Depth-first search, see https://en.wikipedia.org/wiki/Depth-first_search



        #include <iostream>

        #define MAX_SIZE 1024

        char rule[10] = {
        "S AC$",
        "C c",
        "C ;",
        "A aBCd",
        "A BQ",
        "B bB",
        "B ;",
        "Q q",
        "Q ;"
        };

        constexpr int rule_number = sizeof(rule) / sizeof(rule[0]);

        char first_set[MAX_SIZE];

        bool findfirst(int row, int col, int *n, bool* used) {
        for (;;) {
        char ch = rule[row][col];
        if (ch == '$' || ch == ';' || ch == '') {
        first_set[*n] = '';
        break;
        }
        if (islower(ch)) {
        first_set[(*n)++] = ch;
        ++col;
        continue;
        }
        int i;
        for (i = 0; i != rule_number; ++i) {
        if (used[i] == true || rule[i][0] != ch)
        continue;
        used[i] = true;
        int k = *n;
        if (findfirst(i, 2, n, used) == true)
        break;
        used[i] = false;
        *n = k;
        }
        if (i == rule_number)
        return false;
        ++col;
        }
        return true;
        }

        int main() {
        bool used[rule_number];
        int n = 0;
        for (int i = 2; rule[0][i] != '$' && rule[0][i] != ''; ++i) {
        for (int j = 0; j != rule_number; ++j)
        used[j] = false;
        used[0] = true;
        findfirst(0, i, &n, used);
        }
        std::cout << first_set << std::endl;
        return 0;
        }





        share|improve this answer












        You could do like the following code:



        used[i] means the rule[i] is used or not



        The method is Depth-first search, see https://en.wikipedia.org/wiki/Depth-first_search



        #include <iostream>

        #define MAX_SIZE 1024

        char rule[10] = {
        "S AC$",
        "C c",
        "C ;",
        "A aBCd",
        "A BQ",
        "B bB",
        "B ;",
        "Q q",
        "Q ;"
        };

        constexpr int rule_number = sizeof(rule) / sizeof(rule[0]);

        char first_set[MAX_SIZE];

        bool findfirst(int row, int col, int *n, bool* used) {
        for (;;) {
        char ch = rule[row][col];
        if (ch == '$' || ch == ';' || ch == '') {
        first_set[*n] = '';
        break;
        }
        if (islower(ch)) {
        first_set[(*n)++] = ch;
        ++col;
        continue;
        }
        int i;
        for (i = 0; i != rule_number; ++i) {
        if (used[i] == true || rule[i][0] != ch)
        continue;
        used[i] = true;
        int k = *n;
        if (findfirst(i, 2, n, used) == true)
        break;
        used[i] = false;
        *n = k;
        }
        if (i == rule_number)
        return false;
        ++col;
        }
        return true;
        }

        int main() {
        bool used[rule_number];
        int n = 0;
        for (int i = 2; rule[0][i] != '$' && rule[0][i] != ''; ++i) {
        for (int j = 0; j != rule_number; ++j)
        used[j] = false;
        used[0] = true;
        findfirst(0, i, &n, used);
        }
        std::cout << first_set << std::endl;
        return 0;
        }






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 19 at 15:56









        Yunbin Liu

        1,012313




        1,012313






























             

            draft saved


            draft discarded



















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53372838%2fhow-to-find-first-set%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Wiesbaden

            Marschland

            Dieringhausen