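The problem with the regular expression is that it cannot handle paths with more than two elements. It splits the path into a database name and a table name (if any). It also cannot handle SQLite special filenames such as ':memory:' (which are very useful for testing).
To keep a regex-based approach maintainable, the best option is a dispatch table keyed by the main protocols that need different parsing, with one subroutine for each parsing method. Writing the regex with the /x modifier also helps, because it lets you put comments inside the pattern, which improves maintainability.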
sub test_re {
    my $url = shift;
    my $x   = {};
    @$x{qw(conn driver user pass host port dbname table_name tparam_name tparam_value conn_param_string)} =
        $url =~ m{
            ^(
                (\w*)                   # driver
                ://
                (?:
                    (\w+)               # user
                    (?:
                        \:
                        ([^/\@]*)       # password
                    )?
                    \@
                )?                      # user and password are optional
                (?:
                    ([\w\-\.]+)         # host
                    (?:
                        \:
                        (\d+)           # port
                    )?                  # port is optional
                )?                      # host and port are optional
                /                       # becomes a third '/' when user, pass, host and port are all absent
                (\w*)                   # the db name, only up to the first '/' if any; will not work with full paths for sqlite
            )
            (?:
                /                       # if there is a table
                (\w+)                   # table name
                (?:
                    \?                  # table parameter
                    (\w+)
                    =
                    (\w+)
                )?                      # the parameter is optional, but it always follows a table name
            )?                          # table and parameter are optional
            (
                (?:
                    ;
                    (\w+)
                    =
                    (\w+)
                )*                      # remaining parameters, if any
            )
            $
        }x;
    return $x;
}
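The dispatch-table idea mentioned above could be sketched roughly as follows. This is only a minimal illustration, not a complete parser: the names parse_server_url, parse_file_url, parse_db_url and %parser_for are hypothetical, and treating everything after 'sqlite:///' as the file name is an assumption.

```perl
use strict;
use warnings;

# Hypothetical parser for server-style URLs (the part after "scheme:").
sub parse_server_url {
    my ($rest) = @_;
    my ($user, $pass, $host, $port, $path) = $rest =~ m{
        ^ //
        (?: ([^:/\@]+) (?: : ([^/\@]*) )? \@ )?   # optional user and password
        ([\w.\-]+)                                # host
        (?: : (\d+) )?                            # optional port
        (?: / (.*) )?                             # everything after the host
        \z
    }x or return;
    # Assumption: first path element is the database, the rest is the table.
    my ($dbname, $table) = defined $path ? split(m{/}, $path, 2) : ();
    return {
        user   => $user,   pass       => $pass,
        host   => $host,   port       => $port,
        dbname => $dbname, table_name => $table,
    };
}

# Hypothetical parser for file-based databases.
sub parse_file_url {
    my ($rest) = @_;
    # Assumption: everything after the three slashes names the database file.
    my ($file) = $rest =~ m{^///(.+)\z}s or return;
    return { dbname => $file };
}

# The dispatch table: one parsing subroutine per protocol.
my %parser_for = (
    mysql  => \&parse_server_url,
    sqlite => \&parse_file_url,
);

sub parse_db_url {
    my ($url) = @_;
    my ($scheme, $rest) = $url =~ m{^(\w+):(.*)\z}s or return;
    my $parser = $parser_for{ lc $scheme } or return;   # unknown protocol
    my $parsed = $parser->($rest) or return;            # malformed URL
    $parsed->{driver} = lc $scheme;
    return $parsed;
}

my $info = parse_db_url('mysql://anonymous@my.self.com:1234/dbname/tablename');
print "$info->{driver} $info->{host} $info->{dbname}\n" if $info;
```

Supporting another protocol then only means adding one entry to %parser_for and one small subroutine, instead of growing a single large pattern.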
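However, I would recommend using URI::Split (less verbose than URI) and then splitting the path as needed.
Here you can see the difference between using the RE and using URI::Split: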
use feature ':5.10';
use strict;
use URI::Split qw(uri_join uri_split);
use Data::Dumper;
my $urls = [qw(
    mysql://anonymous@my.self.com:1234/dbname
    mysql://anonymous@my.self.com:1234/dbname/tablename
    mysql://anonymous@my.self.com:1234/dbname/pathextra/tablename
    sqlite:///dbname_which_is_a_file
    sqlite:///tmp/dbname_which_is_a_file
    sqlite:///tmp/db/dbname_which_is_a_file
    sqlite:///:dbname_which_is_a_file
    sqlite:///:memory
)];
foreach my $url (@$urls) {
    say "== testing $url";
    say "- RE";
    print Dumper(test_re($url));
    say "- URI::Split";
    print Dumper(uri_split($url));
}
Results:
[...]
== testing sqlite:
- RE
$VAR1 = ;
- URI::Split
$VAR1 = 'sqlite';
$VAR2 = '';
$VAR3 = '/dbname_which_is_a_file';
$VAR4 = undef;
$VAR5 = undef;
== testing sqlite:
- RE
$VAR1 = ;
- URI::Split
$VAR1 = 'sqlite';
$VAR2 = '';
$VAR3 = '/tmp/dbname_which_is_a_file';
$VAR4 = undef;
$VAR5 = undef;
== testing sqlite:
- RE
$VAR1 = ;
- URI::Split
$VAR1 = 'sqlite';
$VAR2 = '';
$VAR3 = '/tmp/db/dbname_which_is_a_file';
$VAR4 = undef;
$VAR5 = undef;
== testing sqlite:
- RE
$VAR1 = ;
- URI::Split
$VAR1 = 'sqlite';
$VAR2 = '';
$VAR3 = '/:memory';
$VAR4 = undef;
$VAR5 = undef;
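Splitting the path returned by uri_split "as needed" might look something like the sketch below. The helper name split_db_url and the keys it returns are hypothetical, and the rules it applies (the whole path is the file name for sqlite; the first path element is the database and the last one the table for everything else) are assumptions you would adapt to your own URL conventions.

```perl
use strict;
use warnings;
use URI::Split qw(uri_split);

# Hypothetical helper: split the URL with uri_split, then interpret the
# path according to the scheme.
sub split_db_url {
    my ($url) = @_;
    my ($scheme, $auth, $path, $query, $frag) = uri_split($url);
    return unless defined $scheme && defined $path;

    my %info = (driver => $scheme, query => $query);
    (my $rel = $path) =~ s{^/}{};    # drop the leading slash

    if ($scheme eq 'sqlite') {
        # Assumption: the whole remaining path is the database file name,
        # so special names and multi-element paths pass through untouched.
        $info{dbname} = $rel;
    }
    else {
        # Assumption: first path element is the database, last is the table.
        my @parts = split m{/}, $rel;
        $info{dbname}     = shift @parts;
        $info{table_name} = pop @parts;    # undef when there is no table
        $info{authority}  = $auth;         # user, host and port, still unsplit
    }
    return \%info;
}

for my $url ('mysql://anonymous@my.self.com:1234/dbname/pathextra/tablename',
             'sqlite:///:memory') {
    my $info = split_db_url($url);
    print "$url => db '$info->{dbname}'\n";
}
```

Because uri_split only separates scheme, authority, path, query and fragment, the path-specific decisions stay in a few lines of ordinary Perl rather than in the pattern itself.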