esokos
March 7, 2023, 5:15pm
1
Hello all,
Is there a way to select segments from a multi-segment file based on segment header values?
E.g. I have a file like the one below and want to select the segments with -Z > 0.3 (I hope this is not too trivial!).
>-Z.390000
38.01660 38.21025 0.50000
38.12926 38.22511 0.50000
38.12893 38.22655 9.50000
38.01627 38.21169 9.50000
>-Z.005270
38.00179 38.20296 0.58000
37.96190 38.11862 0.58000
37.98032 38.11319 9.42000
38.02023 38.19752 9.42000
>-Z.037600
37.26096 38.01535 0.58000
37.36783 37.98445 0.58000
37.37458 37.99904 9.42000
37.26769 38.02994 9.42000
>-Z.656000
37.20471 38.01427 0.50000
37.31859 38.01423 0.50000
37.31859 38.01567 9.50000
37.20471 38.01571 9.50000
I would probably preprocess the file with a small Perl program, something along the lines of:
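$cap = 0;                                 # true while inside a wanted segment
while (<>) {
    m/^>-Z(.*)/ and $cap = ($1 > 0.3);    # on a header line, test its -Z value
    print $_ if ($cap);                   # pass header and records of kept segments
}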
and then pipe to your chosen gmt module
I think not, but it seems a reasonable feature request for gmt select -Z to accept a new modifier +h that picks up the z-value from the segment header instead of the third column (and hence applies it to every record in the segment).
This is also doable with csplit(1), I think.
Maybe a bit too much, but a fun exercise in bash.
# split every segment into individual files
$ csplit --prefix=split_ --elide-empty-files f "/>-Z/0" "{*}"
123
123
123
123
# get the name of files containing Z.3 (i.e. 0.3)
$ grep -l "Z\.3" split_0*
split_00
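Very fragile, because I didn't bother to parse the header and convert the value to a float: the grep pattern only finds headers whose value starts with .3, not every value above 0.3.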
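A minimal sketch of the missing step, parsing the header value as a number before comparing, written in Python. The function name `filter_segments`, the regular expression, and the shortened `SAMPLE` (three of the thread's headers with one record each) are illustrative assumptions, not part of GMT or of any post above.

```python
import re

# Captures the number following -Z in a segment header such as ">-Z.390000".
# The pattern is an assumption about how the headers are written.
Z_IN_HEADER = re.compile(r"-Z([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def filter_segments(lines, threshold):
    """Yield header and data lines of segments whose header -Z exceeds threshold."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # New segment: decide once, then apply to all of its records.
            match = Z_IN_HEADER.search(line)
            keep = match is not None and float(match.group(1)) > threshold
        if keep:
            yield line


# Three of the thread's headers, each with its first record only.
SAMPLE = [
    ">-Z.390000\n", "38.01660 38.21025 0.50000\n",
    ">-Z.005270\n", "38.00179 38.20296 0.58000\n",
    ">-Z.656000\n", "37.20471 38.01427 0.50000\n",
]

print("".join(filter_segments(SAMPLE, 0.3)), end="")
```

In a pipeline, the same generator could be fed `sys.stdin` and its output written to `sys.stdout` before handing the result to a GMT module.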
julia> D = gmtread("zvals_.dat")
Vector{GMTdataset{Float64, 2}} with 4 segments
Show first segment. To see other segments just type its element number. E.g. D[7]
BoundingBox: [38.01627, 38.12926, 38.21025, 38.22655, 0.5, 9.5]
Header: -Z.390000
4×3 GMTdataset{Float64, 2}
Row │ col.1 col.2 col.3
│ Float64 Float64 Float64
─────┼───────────────────────────
1 │ 38.0166 38.2103 0.5
2 │ 38.1293 38.2251 0.5
3 │ 38.1289 38.2266 9.5
4 │ 38.0163 38.2117 9.5
julia> ind = [D[k].header > "-Z.3" for k=1:length(D)]
4-element Vector{Bool}:
1
0
0
1
julia> D03 = D[ind]
Vector{GMTdataset{Float64, 2}} with 2 segments
Show first segment. To see other segments just type its element number. E.g. D[7]
BoundingBox: [38.01627, 38.12926, 38.21025, 38.22655, 0.5, 9.5]
Header: -Z.390000
4×3 GMTdataset{Float64, 2}
Row │ col.1 col.2 col.3
│ Float64 Float64 Float64
─────┼───────────────────────────
1 │ 38.0166 38.2103 0.5
2 │ 38.1293 38.2251 0.5
3 │ 38.1289 38.2266 9.5
4 │ 38.0163 38.2117 9.5
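The comparison in the Julia session above is between strings, which works here because every header uses the same `-Z.dddddd` spelling. As far as I know, Julia orders strings character by character much as Python does, so the caveat can be sketched in Python. The headers `-Z0.1` and `-Z10` are hypothetical spellings added for illustration; they do not occur in the posted file, and `header_z` is a made-up helper.

```python
def header_z(header):
    """Parse the number after -Z in a header such as '-Z.390000'."""
    return float(header.split("-Z", 1)[1].split()[0])


def select_by_string(headers, ref):
    """Keep headers that sort after ref in character-by-character order."""
    return [h for h in headers if h > ref]


def select_by_value(headers, threshold):
    """Keep headers whose parsed -Z value is numerically above threshold."""
    return [h for h in headers if header_z(h) > threshold]


# First two headers are from the thread; the last two are hypothetical.
headers = ["-Z.390000", "-Z.005270", "-Z0.1", "-Z10"]

print(select_by_string(headers, "-Z.3"))  # ['-Z.390000', '-Z0.1', '-Z10']
print(select_by_value(headers, 0.3))      # ['-Z.390000', '-Z10']
```

The string test wrongly keeps `-Z0.1` because the character `0` sorts after `.`, so parsing the value first is the safer choice when the header format is not fixed.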
I cannot see csplit working for things like selecting the segments whose -Z value is between 4 and 6.7. Hard to beat
gmt select oldfile.txt -Z4/6.7+h > those_only.txt
even in Julia or Python.
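For a GMT build without such an option, a range selection on header values can be sketched in Python by grouping the file into segments first. Everything named here (`read_segments`, `select_range`, the regular expression) is an illustrative assumption, the bounds 0.03 and 0.5 are chosen only to fit the values posted in this thread, and treating both bounds as inclusive is also an assumption.

```python
import re

# Captures the number following -Z in a segment header such as ">-Z.390000".
Z_IN_HEADER = re.compile(r"-Z([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def read_segments(lines):
    """Group lines into (z, header, rows); z is None when the header has no -Z."""
    segments = []
    for line in lines:
        if line.startswith(">"):
            match = Z_IN_HEADER.search(line)
            z = float(match.group(1)) if match else None
            segments.append((z, line, []))
        elif segments:
            # Data record: attach it to the most recent header.
            segments[-1][2].append(line)
    return segments


def select_range(segments, zmin, zmax):
    """Return the segments with zmin <= z <= zmax (inclusive by assumption)."""
    return [s for s in segments if s[0] is not None and zmin <= s[0] <= zmax]


# The thread's four headers, each with its first record only.
SAMPLE = [
    ">-Z.390000\n", "38.01660 38.21025 0.50000\n",
    ">-Z.005270\n", "38.00179 38.20296 0.58000\n",
    ">-Z.037600\n", "37.26096 38.01535 0.58000\n",
    ">-Z.656000\n", "37.20471 38.01427 0.50000\n",
]

for z, header, rows in select_range(read_segments(SAMPLE), 0.03, 0.5):
    print(header + "".join(rows), end="")
```

Keeping the segments as tuples makes further tests on the parsed value easy to add, at the cost of holding the whole file in memory.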
But there is no +h modifier in gmtselect -Z! And -Z operates on the 3rd column, while the question was about how to operate on a value encoded in the header.
How was the GMT vector file with the -Z values created? It might be easier to change the code that created the file in the first place.
esokos
March 8, 2023, 7:04am
9
Hello all,
Thank you for all the suggestions! I finally solved it using a small external code. Although such a code is easy to write (maybe!), it would be nice for GMT to have such a feature included.
esokos
March 8, 2023, 7:04am
10
Yes, it would make sense, but the file was given to me!
I know there is no +h. See earlier about me proposing it.
Such a filter option would be a great feature to have!
For those who wish to try this feature, check out this branch, build and explore: https://github.com/GenericMappingTools/gmt/pull/7309
Now approved and merged into the master branch, so it should be easy for @chhei-s and @esokos to try out!