两个输入文件都相同,以下是它们的外观。两个文件都有约45k行。
Europe;3;France;23;Parijs;42545;48,856555;2,350976
Europe;3;France;23;Parisot;84459;44,264381;1,857827
Europe;3;France;23;Parlan;11337;44,828976;2,172435
Europe;3;France;23;Parnac;35670;46,4533;1,4425
Europe;3;France;23;Parnans;22065;45,1097;5,1456
假设这些目的地中有两个彼此靠近,我想要这样输出它们:
Europe;3;France;23;Parijs;42545;48,856555;2,350976;Parlan;11337;200
Europe;3;France;23;Parisot;84459;44,264381;1,857827;
Europe;3;France;23;Parlan;11337;44,828976;2,172435;
Europe;3;France;23;Parnac;35670;46,4533;1,4425;Parisot;84459;2000;Parnans;22065;350
Europe;3;France;23;Parnans;22065;45,1097;5,1456;
实际结果当然会匹配多于2个。在输出文件中,已匹配的目的地、目的地ID和计算距离将被添加。每个目的地可能都有多个匹配项。这真的很难解释,哈哈。正如我所说,我使用了strict&;warnings并将错误缩小到最小,但仍未完全解决。以下是这些错误:
Global symbol "$infile1" requires explicit package name at E:\etc.pl line 17.
Execution of E:\etc.pl aborted due to compilation errors.
这是我目前的代码。我的声明让我很糊涂,需要帮助。
有人可以帮我吗?(也许这不是最有效的方法,但现在它可以帮助我逐步了解 Perl)
use strict;
use warnings;
use GIS::Distance::Lite qw(distance);
my $inputfile1 = shift || die "Give input!\n";
my $inputfile2 = shift || die "Give more input!\n";
my $outputfile = shift || die "Give output!\n";
open my $INFILE1, '<', $inputfile1 or die "In use/Not found :$!\n";
open my $INFILE2, '<', $inputfile2 or die "In use/Not found :$!\n";
open my $OUTFILE, '>', $outputfile or die "In use/Not found :$!\n";
my $maxdist = 5000;
my $mindist = 0.0001;
while ( my @infile1 ){
my @elements = split(";",$infile1);
my $lat1 = $elements[6];
my $lon1 = $elements[7];
$lat1 =~ s/,/./g;
$lon1 =~ s/,/./g;
seek my $infile2, 0, 0;
print "1. $lat1\n";
print "2. $lon1\n";
while ( my @infile2 ){
my @loopelements = split(";",$infile2);
my $lat2 = $loopelements[6];
my $lon2 = $loopelements[7];
$lat2 =~ s/,/./g;
$lon2 =~ s/,/./g;
print "3. $lat1\n";
print "4. $lon1\n";
my $distance = distance($lat1, $lon1 => $lat2, $lon2); # Afstand berekenen tussen latlon1 and latlon2
print "5. $distance\n";
my $afstand = sprintf("%.4f",$distance);
print "6. $afstand\n";
if (($afstand < $maxdist) and (!($elements[4] == $loopelements[4]))){
push (@elements, $afstand,$loopelements[4],$loopelements[5]);
print "7. $afstand\n";
} else {
next;
}
}
@elements = join(";",@elements); # add ';' to all elements
print OUTFILE "@elements";
#if ($i == 10) {last;}
}
close(INFILE1);
close(INFILE2);
close(OUTFILE);
--------------- EDIT --------------
您好,我又来了。我看了您的更新代码,这是我的一个相当复杂版本哈哈。老实说,我只理解了一半。它仍然非常有用!我决定保留原始脚本设计,并采纳您的改进,但它仍然无法正常运行。如果您不介意的话,我有几个问题:
我对脚本进行了一些调整。第一个是现在跳过具有零的latlons,因为那将得到无用的结果。在同一行中,它还跳过了空单元格,这也是无用的。我已经为两个输入文件都做了这个调整。
哦,当我说elements[4]时,我想说的是elements[5],所以它应该是数字。所以我把ne换成了!=,如果我没记错的话。但我认为我又创建了一个无限循环,因为它没有循环遍历第二个文件。我知道我可能看起来很固执,但我想先理解我的原始脚本,然后再开始使用您的版本。seek函数似乎无法正常工作。以下是当前的脚本。
use strict;
use warnings;
use GIS::Distance::Lite qw(distance);
my $inputfile1 = shift || die "Give input!\n";
my $inputfile2 = shift || die "Give more input!\n";
my $outputfile = shift || die "Give output!\n";
open my $INFILE1, '<', $inputfile1 or die "In use/Not found :$!\n";
open my $INFILE2, '<', $inputfile2 or die "In use/Not found :$!\n";
open my $OUTFILE, '>', $outputfile or die "In use/Not found :$!\n";
my $maxdist = 3000;
my $mindist = 0.0001;
while (my $infile1 = <$INFILE1> ){
chomp $infile1;
my @elements = split(";",$infile1);
print "1. $elements[6]\n";
print "2. $elements[7]\n";
my $lat1 = $elements[6];
my $lon1 = $elements[7];
if ((($lat1 and $lon1) ne '0') and (!($lat1 and $lon1) eq "")){
$lat1 =~ s/,/./;
$lon1 =~ s/,/./;
print "lat1: $lat1\n";
print "lon1: $lon1\n";
} else {
next;
}
print "3. $lat1\n";
print "4. $lon1\n";
seek $INFILE2, 0, 0;
while ( my $infile2 = <$INFILE2> ){
chomp $infile2;
my @loopelements = split(";",$infile2);
print "5. $elements[6]\n";
print "6. $elements[7]\n";
my $lat2 = $loopelements[6];
my $lon2 = $loopelements[7];
if ((($lat2 and $lon2) ne '0') and (!($lat2 and $lon2) eq "")){
$lat2 =~ s/,/./;
$lon2 =~ s/,/./;
print "lat2: $lat1\n";
print "lon2: $lon1\n";
} else {
next;
}
my $distance = distance($lat1, $lon1 => $lat2, $lon2); # Afstand berekenen tussen latlon1 and latlon2
print "7. $distance\n";
my $afstand = sprintf("%.4f",$distance);
print "8. $afstand\n";
if ($afstand < $maxdist && $elements[4] != $loopelements[4]){
push (@elements, $afstand, $loopelements[4],$loopelements[5]);
print "9. $afstand\n";
} else {
next;
}
}
print $OUTFILE join(";",@elements), "\n";
}
close($INFILE1);
close($INFILE2);
close($OUTFILE);
seek
。它用于重新读取一定长度的字节。 - simbabque