String input to flex lexer

52

The following routines are available for setting up input buffers for scanning in-memory strings instead of files (as yy_create_buffer does):

YY_BUFFER_STATE yy_scan_string(const char *str): scans a NUL-terminated string
YY_BUFFER_STATE yy_scan_bytes(const char *bytes, int len): scans len bytes (possibly including NULs) starting at location bytes

Note that both of these functions create and return a corresponding YY_BUFFER_STATE handle (which you must delete with yy_delete_buffer() when you are done with it), so yylex() scans a copy of the string or bytes. This behavior may be desirable, since yylex() modifies the contents of the buffer it is scanning.
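
The practical difference between the two entry points is whether strlen() can measure the input. As a minimal sketch, assuming a hypothetical helper called has_embedded_nul (it is not part of flex), a caller could decide like this:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of flex: returns 1 if the first `len`
 * bytes contain a NUL. In that case strlen() would stop early, so
 * yy_scan_bytes(bytes, len) is the safer choice. Returns 0 if
 * yy_scan_string() would see the whole input. */
int has_embedded_nul(const char *bytes, size_t len)
{
    return memchr(bytes, '\0', len) != NULL;
}
```

For ordinary C strings the helper returns 0 and yy_scan_string() is enough; for binary data read from a file it may return 1, and the length has to be passed explicitly.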

If you want to avoid the copy (and yy_delete_buffer), use:

YY_BUFFER_STATE yy_scan_buffer(char *base, yy_size_t size)

Sample main:

int main() { 
    yy_scan_buffer("a test string"); 
    yylex(); 
}

2009-04-23 08:10:37 dfa

+0

Hey dfa (which is fitting, considering this is flex), could you add something about the double-NUL requirement?

+3

The sample main will probably fail, since 'yy_scan_buffer' requires a writable buffer (it temporarily modifies the buffer, inserting NULs to terminate 'yytext' and then restoring the original characters), AND it requires TWO terminating NUL bytes.
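
To make the two requirements concrete, here is a minimal sketch that builds a buffer satisfying both. The helper name make_scan_buffer is an assumption for illustration and is not part of flex; the call to yy_scan_buffer appears only in a comment because it needs a generated scanner to link against.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of flex: returns a malloc'd, writable
 * copy of `text` followed by TWO NUL bytes, and stores the total size
 * (strlen(text) + 2) in *size_out. Returns NULL if allocation fails. */
char *make_scan_buffer(const char *text, size_t *size_out)
{
    size_t len = strlen(text);
    char *buf = malloc(len + 2);
    if (buf == NULL)
        return NULL;
    memcpy(buf, text, len);
    buf[len] = '\0';      /* first terminator  */
    buf[len + 1] = '\0';  /* second terminator */
    *size_out = len + 2;
    return buf;
}

/* Intended use with a flex-generated scanner (not compiled here):
 *
 *     size_t size;
 *     char *buf = make_scan_buffer("a test string", &size);
 *     YY_BUFFER_STATE state = yy_scan_buffer(buf, size);
 *     yylex();
 *     yy_delete_buffer(state);
 *     free(buf);
 */
```

Because the buffer comes from malloc, it is writable, unlike the string literal in the sample main.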

+0

By the way, in my C++ program, I had to declare 'yy_scan_bytes' with a 'size_t len' parameter to avoid linker errors. – coldfix

17

See this section of the Flex manual for information on how to scan in-memory buffers, such as strings.

2009-04-23 07:38:53 unwind

+0

Oh man, I can't believe I missed that. – bjorns

+0

http://ftp.gnu.org/old-gnu/Manuals/flex-2.5.4/html_node/flex_12.html – Bastiano9

8

Here is what I needed to do:

extern yy_buffer_state; 
typedef yy_buffer_state *YY_BUFFER_STATE; 
extern int yyparse(); 
extern YY_BUFFER_STATE yy_scan_buffer(char *, size_t); 

int main(int argc, char** argv) { 

    char tstr[] = "line i want to parse\n\0\0"; 
    // note yy_scan_buffer is looking for a double-NUL-terminated string 
    yy_scan_buffer(tstr, sizeof(tstr)); 
    yyparse(); 
    return 0; 
}

You can't extern the typedef, which makes sense when you think about it.

2012-02-15 21:05:15 jlholland

+1

You can't use extern without any type defined [line 1].

+1

It is enough to specify a single '\0', since the second one is implicitly there (because it is a string literal). – a3f
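
The point about the implicit terminator can be checked with a small sketch. The helper ends_with_two_nuls is hypothetical (not part of flex) and only inspects the last two bytes of a buffer:

```c
#include <stddef.h>

/* Hypothetical helper, not part of flex: returns 1 if the last two bytes
 * of a buffer of `size` bytes are both NUL, which is the layout that
 * yy_scan_buffer() expects; returns 0 otherwise. */
int ends_with_two_nuls(const char *buf, size_t size)
{
    return size >= 2 && buf[size - 2] == '\0' && buf[size - 1] == '\0';
}
```

With char one[] = "abc\0"; the array holds three characters, the explicit NUL and the implicit one, so sizeof one is 5 and the helper returns 1. With char none[] = "abc"; only the implicit NUL is present and the helper returns 0.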

8

flex can parse char * using any one of three functions: yy_scan_string(), yy_scan_buffer(), and yy_scan_bytes() (see the documentation). Here is an example of the first:

typedef struct yy_buffer_state * YY_BUFFER_STATE; 
extern int yyparse(); 
extern YY_BUFFER_STATE yy_scan_string(const char * str); 
extern void yy_delete_buffer(YY_BUFFER_STATE buffer); 

int main(){ 
    char string[] = "String to be parsed."; 
    YY_BUFFER_STATE buffer = yy_scan_string(string); 
    yyparse(); 
    yy_delete_buffer(buffer); 
    return 0; 
}

The equivalent statements for yy_scan_buffer() (which requires a doubly NUL-terminated string):

char string[] = "String to be parsed.\0"; 
YY_BUFFER_STATE buffer = yy_scan_buffer(string, sizeof(string));

My answer reiterates some of the information provided by @dfa and @jlholland, but neither of their answers' code seemed to work for me.

2014-05-08 15:16:48 sevko

+1

How will it know what struct yy_buffer_state is?

+0

It won't, *but it doesn't care*. All we are doing is declaring [opaque pointers](http://en.wikipedia.org/wiki/Opaque_pointer) to an unknown struct named 'yy_buffer_state', which the compiler knows to be 4 bytes wide (or whatever your system's pointer size is): since we never access any of its member variables, the compiler does not need to know the composition of that struct. You can replace 'struct yy_buffer_state' in the 'typedef' with 'void' or 'int' or whatever, since the pointer size will be the same for each. – sevko
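
As a minimal sketch of the opaque-pointer idea, the code below forward-declares the struct and passes a handle around without ever defining its members. The helper remember_handle is hypothetical and exists only for illustration. Note that C guarantees all pointers to struct types share one representation; the claim that 'void *' or 'int *' would work equally well holds on common platforms but is an assumption rather than a guarantee.

```c
#include <stddef.h>

/* Forward declaration only: the members of the struct are never
 * defined here, so it stays an incomplete (opaque) type. */
struct yy_buffer_state;
typedef struct yy_buffer_state *YY_BUFFER_STATE;

/* Hypothetical helper, not part of flex: keeps the most recent handle
 * so it can be released later. It only copies the pointer and never
 * dereferences it, which is why the struct layout is not needed. */
static YY_BUFFER_STATE current_handle = NULL;

YY_BUFFER_STATE remember_handle(YY_BUFFER_STATE handle)
{
    current_handle = handle;
    return current_handle;
}
```

Any attempt to read a member through the handle in this file would fail to compile, which is exactly the protection an opaque pointer is meant to give.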

1

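Alternatively, you can redefine the YY_INPUT macro in the lex file, then set your string as LEX's input. YY_INPUT takes three arguments: the destination buffer, a variable that receives the number of bytes read (0 means end of input), and the maximum number of bytes to read. As below:

#undef YY_INPUT 
#define YY_INPUT(buf, result, max_size) ((result) = my_yyinput((buf), (max_size))) 

/* Text still to be scanned; it must stay valid until scanning ends. */ 
static const char *my_buf = ""; 

void set_lexbuf(const char *org_str) 
{ my_buf = org_str; } 

/* Copy at most max_size bytes into flex's buffer; 0 signals EOF. */ 
size_t my_yyinput(char *buf, size_t max_size) 
{ 
    size_t n = strlen(my_buf); 
    if (n > max_size) n = max_size; 
    memcpy(buf, my_buf, n); 
    my_buf += n; 
    return n; 
}

In your main.c, before scanning, you need to set lex's buffer first: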

set_lexbuf(your_string); 
scanning...

2014-10-16 05:16:21 Kingcesc

3

The accepted answer is incorrect. It will cause memory leaks.

Internally, yy_scan_string calls yy_scan_bytes which, in turn, calls yy_scan_buffer.

yy_scan_bytes allocates memory for a COPY of the input buffer.

yy_scan_buffer works directly on the supplied buffer.

With all three forms, you MUST call yy_delete_buffer to free the flex buffer-state information (YY_BUFFER_STATE).

However, with yy_scan_buffer, you avoid the internal allocation/copy/free of the internal buffer.

The prototype for yy_scan_buffer does NOT take a const char *, and you MUST NOT expect the contents to remain unchanged.

If you allocated memory to hold your string, you are responsible for freeing it AFTER you call yy_delete_buffer.

Also, don't forget to have yywrap return 1 (non-zero) when you are parsing JUST this string.

Below is a COMPLETE example.

%% 

<<EOF>> return 0; 

. return 1; 

%% 

int yywrap() 
{ 
    return (1); 
} 

int main(int argc, const char* const argv[]) 
{ 
    FILE* fileHandle = fopen(argv[1], "rb"); 
    if (fileHandle == NULL) { 
     perror("fopen"); 
     return (EXIT_FAILURE); 
    } 

    fseek(fileHandle, 0, SEEK_END); 
    long fileSize = ftell(fileHandle); 
    fseek(fileHandle, 0, SEEK_SET); 

    // When using yy_scan_bytes, do not add 2 here ... 
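    char *string = malloc(fileSize + 2); 
    if (string == NULL) { 
     perror("malloc"); 
     fclose(fileHandle); 
     return (EXIT_FAILURE); 
    } 

    // Read fileSize items of one byte each. 
    fread(string, sizeof(char), fileSize, fileHandle); 

    fclose(fileHandle); 

    // Add the two NUL terminators, required by flex. 
    // Omit this for yy_scan_bytes(), which allocates, copies and 
    // appends these for us. 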
    string[fileSize] = '\0'; 
    string[fileSize + 1] = '\0'; 

    // Our input file may contain NULs ('\0') so we MUST use 
    // yy_scan_buffer() or yy_scan_bytes(). For a normal C (NUL- 
    // terminated) string, we are better off using yy_scan_string() and 
    // letting flex manage making a copy of it so the original may be a 
    // const char (i.e., literal) string. 
    YY_BUFFER_STATE buffer = yy_scan_buffer(string, fileSize + 2); 

    // This is a flex source file, for yacc/bison call yyparse() 
    // here instead ... 
    int token; 
    do { 
     token = yylex(); // MAY modify the contents of the 'string'. 
    } while (token != 0); 

    // After flex is done, tell it to release the memory it allocated.  
    yy_delete_buffer(buffer); 

    // And now we can release our (now dirty) buffer. 
    free(string); 

    return (EXIT_SUCCESS); 
}

2016-03-27 23:33:56

+0

Would you have to call yylex/yyparse each time yourself? – velocirabbit

0

Here is a small example of using bison/flex as a parser inside your C++ code to parse a string and change a string value according to it (a few parts of the code were removed, so there may be irrelevant parts left in). parser.y:

%{ 
#include "parser.h" 
#include "lex.h" 
#include <math.h> 
#include <fstream> 
#include <iostream> 
#include <string> 
#include <vector> 
using namespace std; 
int yyerror(yyscan_t scanner, string result, const char *s){ 
    (void)scanner; 
    std::cout << "yyerror : " << *s << " - " << s << std::endl; 
    return 1; 
} 
%} 

%code requires{ 
#define YY_TYPEDEF_YY_SCANNER_T 
typedef void * yyscan_t; 
#define YYERROR_VERBOSE 0 
#define YYMAXDEPTH 65536*1024 
#include <math.h> 
#include <fstream> 
#include <iostream> 
#include <string> 
#include <vector> 
} 
%output "parser.cpp" 
%defines "parser.h" 
%define api.pure full 
%lex-param{ yyscan_t scanner } 
%parse-param{ yyscan_t scanner } {std::string & result} 

%union { 
    std::string * sval; 
} 

%token TOKEN_ID TOKEN_ERROR TOKEN_OB TOKEN_CB TOKEN_AND TOKEN_XOR TOKEN_OR TOKEN_NOT 
%type <sval> TOKEN_ID expression unary_expression binary_expression 
%left BINARY_PRIO 
%left UNARY_PRIO 
%% 

top: 
expression {result = *$1;} 
; 
expression: 
TOKEN_ID {$$=$1; } 
| TOKEN_OB expression TOKEN_CB {$$=$2;} 
| binary_expression {$$=$1;} 
| unary_expression {$$=$1;} 
; 

unary_expression: 
TOKEN_NOT expression %prec UNARY_PRIO {result = " (NOT " + *$2 + ") " ; $$ = &result;} 
; 
binary_expression: 
expression expression %prec BINARY_PRIO {result = " (" + *$1+ " AND " + *$2 + ") "; $$ = &result;} 
| expression TOKEN_AND expression %prec BINARY_PRIO {result = " (" + *$1+ " AND " + *$3 + ") "; $$ = &result;} 
| expression TOKEN_OR expression %prec BINARY_PRIO {result = " (" + *$1 + " OR " + *$3 + ") "; $$ = &result;} 
| expression TOKEN_XOR expression %prec BINARY_PRIO {result = " (" + *$1 + " XOR " + *$3 + ") "; $$ = &result;} 
; 

%% 

lexer.l : 

%{ 
#include <string> 
#include "parser.h" 

%} 
%option outfile="lex.cpp" header-file="lex.h" 
%option noyywrap never-interactive 
%option reentrant 
%option bison-bridge 

%top{ 
/* This code goes at the "top" of the generated file. */ 
#include <stdint.h> 
} 

id  ([a-zA-Z][a-zA-Z0-9]*)+ 
white  [ \t\r] 
newline [\n] 

%% 
{id}     {  
    yylval->sval = new std::string(yytext); 
    return TOKEN_ID; 
} 
"(" {return TOKEN_OB;} 
")" {return TOKEN_CB;} 
"*" {return TOKEN_AND;} 
"^" {return TOKEN_XOR;} 
"+" {return TOKEN_OR;} 
"!" {return TOKEN_NOT;} 

{white}; // ignore white spaces 
{newline}; 
. { 
return TOKEN_ERROR; 
} 

%% 

usage : 
void parse(std::string& function) { 
    string result = ""; 
    yyscan_t scanner; 
    yylex_init_extra(NULL, &scanner); 
    YY_BUFFER_STATE state = yy_scan_string(function.c_str() , scanner); 
    yyparse(scanner,result); 
    yy_delete_buffer(state, scanner); 
    yylex_destroy(scanner); 
    function = " " + result + " "; 
} 

makefile: 
parser.h parser.cpp: parser.y 
    @ /usr/local/bison/2.7.91/bin/bison -y -d parser.y 


lex.h lex.cpp: lexer.l 
    @ /usr/local/flex/2.5.39/bin/flex lexer.l 

clean: 
    - \rm -f *.o parser.h parser.cpp lex.h lex.cpp

2016-06-24 10:29:04
