I’m trying to get the values using regexes, as follows:
    from textx import metamodel_from_str


    def test_get_hosts2():
        grammar = r"""
        config:
            (
                /(?!host)./
                | hosts+=host
                | 'host'
            )*
        ;

        host:
            'host' host2name=/[0-9a-zA-Z.-]+/
            '{'
                (
                    'fixed-address' fixed_address=/([0-9]{1,3}.){3}[0-9]{1,3}/';'
                    ('option host-name' option_host_name=STRING';')?
                    ('option domain-name-servers' option_domain_name_servers=/([0-9]{1,3}.){3}[0-9]{1,3}, ([0-9]{1,3}.){3}[0-9]{1,3}/';')?
                    ('option netbios-name-servers' option_netbios_name_servers=/([0-9]{1,3}.){3}[0-9]{1,3}/';')?
                    ('option domain-name' option_domain_name=STRING';')?
                )#
            '}'
        ;
        """

        conf_file = r"""
        host corehost.abc.abc.ab {
            fixed-address 172.124.106.10;
            option host-name "hostname.abc.abc.ab";
            option domain-name-servers 123.123.123.120, 123.123.128.142;
            option netbios-name-servers 172.124.106.156;
            option domain-name "abcm1.abc.abc.ab";
            option domain-search "abcm1.abc.abc.ab", "abcmo2.abc.abc.ab", "abcmo.3abc.abc.ab", "abcmo4.abc.abc.ab";
        }
        host corehost2.abc.abc.ab {
            fixed-address 172.124.106.120;
            option host-name "hostname2.abc.abc.ab";
            option domain-name-servers 123.123.123.220, 123.123.128.242;
            option netbios-name-servers 172.124.106.256;
            option domain-name "abcm2.abc.abc.ab";
            option domain-search "abcm2.abc.abc.ab", "abcmo2.abc.abc.ab", "abcm.3abc.abc.ab", "abcm4.abc.abc.ab";
        }
        """

        mm = metamodel_from_str(grammar)
        model = mm.model_from_str(conf_file)
        print(model.hosts)
        # assert len(model.hosts) == 2
        for host in model.hosts:
            print(host)
            print(host.host2name, host.fixed_address, host.option_domain_name_servers, host.option_domain_search)


    if __name__ == "__main__":
        test_get_hosts2()
But I can only get the single-valued fields, such as “fixed-address” and “host2name”. For “domain-name-servers” I put the “,” into the regex, but I don’t think that is the right way, because the number of values is not always the same. Could you help me get the values of “domain-name-servers” and “domain-search” with the right regex?
ref: Parsing dhcpd.conf with textX
Answer
The easiest way is to use textX’s repetition modifiers to match a sequence of comma-separated values. Whenever you use a repetition operator (zero-or-more, one-or-more, etc.), you can add a modifier in square brackets after it. The most frequently used one is the separator modifier, which defines a match that must appear between each pair of consecutive elements.
Compared with trying to match everything with a single regex, this has two side benefits:
- simplicity (easier to maintain)
- you get a nice Python list of elements so you don’t need to process the matched string further.
The working grammar would be (notice the use of +[','], which means one-or-more with a comma as a separator):
    config:
        (
            /(?!host)./
            | hosts+=host
            | 'host'
        )*
    ;

    host:
        'host' host2name=/[0-9a-zA-Z.-]+/
        '{'
            (
                'fixed-address' fixed_address=ip_addr';'
                ('option' 'host-name' option_host_name=STRING';')?
                ('option' 'domain-name-servers' option_domain_name_servers=ip_addr+[',']';')?
                ('option' 'netbios-name-servers' option_netbios_name_servers=ip_addr+[',']';')?
                ('option' 'domain-name' option_domain_name=STRING+[',']';')?
                ('option' 'domain-search' option_domain_search=STRING+[',']';')?
            )#
        '}'
    ;

    ip_addr:
        /([0-9]{1,3}.){3}[0-9]{1,3}/
    ;