This notebook provides an example of how to read NIMS (.BIN) files into an MTH5 file. Each NIMS file represents a single run.
from mth5.mth5 import MTH5
from mth5.io.nims import NIMSCollection
from mth5 import read_file
2022-09-07 13:02:52,583 [line 135] mth5.setup_logger - INFO: Logging file can be found C:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\mth5\logs\mth5_debug.log
We will use the NIMSCollection to assemble the .BIN files into a logical order by run. Each NIMS output file includes all data for every channel of a single run, so the collection is relatively simple.
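As a rough illustration of what grouping files by station and run can look like, the sketch below sorts file names into a two-level mapping. It assumes, based only on the example names mnp300a.BIN and mnp300b.BIN, that the run ID is the file stem and the station ID is the stem without its last character. The helper group_by_station is hypothetical and not part of mth5; the log output in this notebook indicates that the library reads each file header rather than relying on names.

```python
from collections import OrderedDict
from pathlib import PureWindowsPath


def group_by_station(file_names):
    """Group file names into {station_id: {run_id: file_name}}.

    Assumption (not taken from mth5): run ID is the file stem and the
    station ID is the stem without its final character.
    """
    grouped = OrderedDict()
    for name in sorted(file_names, key=str.lower):
        run_id = PureWindowsPath(name).stem
        station_id = run_id[:-1]
        grouped.setdefault(station_id, OrderedDict())[run_id] = name
    return grouped


grouped = group_by_station(["mnp300b.BIN", "mnp300a.BIN"])
for station_id, station_runs in grouped.items():
    print(station_id, list(station_runs))
```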
Metadata: we need to input the survey_id to provide minimal metadata when making an MTH5 file.
NIMSCollection.get_runs() returns a two-level ordered dictionary (OrderedDict). The first level is keyed by station ID; each value is in turn an ordered dictionary keyed by run ID. You can therefore loop over stations and runs.
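The two-level layout can be pictured with a small stand-in. In the sketch below, runs_example is a hypothetical substitute for the object returned by get_runs(): the real values are DataFrames, while plain dictionaries are used here so the example runs without mth5 or any data files.

```python
from collections import OrderedDict

# Hypothetical stand-in for the return value of get_runs().
# First level: station ID. Second level: run ID.
runs_example = OrderedDict(
    mnp300=OrderedDict(
        mnp300a={"sample_rate": 8, "n_samples": 3357078},
        mnp300b={"sample_rate": 8, "n_samples": 1574003},
    )
)

visited = []
for station_id, station_runs in runs_example.items():
    for run_id, run_info in station_runs.items():
        visited.append((station_id, run_id))
        print(f"{station_id}/{run_id}: {run_info['n_samples']} samples (estimated)")
```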
Note: n_samples and end are estimates based on file size, not on the data. To get accurate values, read in the full file.
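To make the idea of a size-based estimate concrete, here is a minimal sketch. It assumes each one-second block of 8 Hz data occupies 131 bytes; that figure is an assumption chosen because it reproduces the file_size and n_samples values displayed for the two example runs, and it is not confirmed here as the formula mth5 uses. The function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

BLOCK_BYTES = 131       # assumed bytes per one-second block at 8 samples/second
SAMPLES_PER_BLOCK = 8   # assumed samples stored in each block


def estimate_n_samples(file_size):
    """Estimate the sample count from the file size alone (header ignored)."""
    return file_size * SAMPLES_PER_BLOCK // BLOCK_BYTES


def estimate_end(start, n_samples, sample_rate):
    """Estimate the end time as start plus the estimated duration."""
    return start + timedelta(seconds=n_samples / sample_rate)


start = datetime(2019, 9, 26, 18, 29, 29, tzinfo=timezone.utc)
n_est = estimate_n_samples(54972155)
end_est = estimate_end(start, n_est, 8)
print(n_est, end_est.isoformat())
```

Because the estimate ignores the header and any incomplete blocks, it can differ from the values obtained after reading the full file.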
nc = NIMSCollection(r"c:\Users\jpeacock\OneDrive - DOI\mt\nims")
nc.survey_id = "test"
runs = nc.get_runs(sample_rates=[8])
print(f"Found {len(runs)} station with {len(runs[list(runs.keys())[0]])} runs")
2022-09-07 13:02:53,062 [line 123] mth5.io.nims.header.NIMS.read_header - INFO: Reading NIMS file c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp300a.BIN
2022-09-07 13:02:53,081 [line 242] mth5.io.nims.header.NIMS.end_time - WARNING: Estimating end time from n_samples
2022-09-07 13:02:53,086 [line 123] mth5.io.nims.header.NIMS.read_header - INFO: Reading NIMS file c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp300b.BIN
2022-09-07 13:02:53,099 [line 242] mth5.io.nims.header.NIMS.end_time - WARNING: Estimating end time from n_samples
Found 1 station with 2 runs
for run_id, run_df in runs["mnp300"].items():
    display(run_df)
| | survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | test | mnp300 | mnp300a | 2019-09-26 18:29:29+00:00 | 2019-10-01 15:03:23+00:00 | 1 | hx,hy,hz,ex,ey,temperature | c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp30... | 8 | 54972155 | 3357078 | 1 | NIMS | None |
| | survey | station | run | start | end | channel_id | component | fn | sample_rate | file_size | n_samples | sequence_number | instrument_id | calibration_fn |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | test | mnp300 | mnp300b | 2019-10-01 16:16:42+00:00 | 2019-10-03 22:55:52+00:00 | 1 | hx,hy,hz,ex,ey,temperature | c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp30... | 8 | 25774314 | 1574003 | 2 | NIMS | None |
Now that we have a logical collection of files, let's load them into an MTH5. We will simply loop over the stations and runs in the ordered dictionary; each run file holds all of its channels.
There are a few things to keep in mind:
- Filters are stored in the survey_group, so add them there.
TODO:
- Make sure filters get propagated through to the MTH5.
- Think about run names.
calibrate = True
m = MTH5()
if calibrate:
    m.data_level = 2
m.open_mth5(nc.file_path.joinpath("from_nims.h5"))
2022-09-07 13:02:53,674 [line 663] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 0.2.0 file c:\Users\jpeacock\OneDrive - DOI\mt\nims\from_nims.h5 in mode a
survey_group = m.add_survey(nc.survey_id)
%%time
for station_id in runs.keys():
    station_group = survey_group.stations_group.add_station(station_id)
    for run_id, run_df in runs[station_id].items():
        run_group = station_group.add_run(run_id)
        # Each run is stored in a single file, so read the first unique file name.
        run_ts = read_file(run_df.fn.unique()[0])
        if calibrate:
            run_ts = run_ts.calibrate()
        run_group.from_runts(run_ts)
    # Update the station metadata from the last run that was read.
    station_group.metadata.update(run_ts.station_metadata)
    station_group.write_metadata()
2022-09-07 13:02:54,234 [line 123] mth5.io.nims.header.NIMS.read_header - INFO: Reading NIMS file c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp300a.BIN
2022-09-07 13:02:54,649 [line 927] mth5.io.nims.header.NIMS.read_nims - WARNING: odd number of bytes 54971209, not even blocks cutting down the data by 72 bits
2022-09-07 13:02:55,667 [line 1019] mth5.io.nims.header.NIMS.read_nims - INFO: Reading took 1.43 seconds
2022-09-07 13:02:56,835 [line 288] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: end time of dataset 2019-10-01T15:07:08+00:00 does not match metadata end 2019-10-01T15:07:07.875000+00:00 updating metatdata value to 2019-10-01T15:07:08+00:00
2022-09-07 13:02:57,909 [line 288] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: end time of dataset 2019-10-01T15:07:08+00:00 does not match metadata end 2019-10-01T15:07:07.875000+00:00 updating metatdata value to 2019-10-01T15:07:08+00:00
2022-09-07 13:03:07,154 [line 123] mth5.io.nims.header.NIMS.read_header - INFO: Reading NIMS file c:\Users\jpeacock\OneDrive - DOI\mt\nims\mnp300b.BIN
2022-09-07 13:03:07,769 [line 1019] mth5.io.nims.header.NIMS.read_nims - INFO: Reading took 0.61 seconds
2022-09-07 13:03:08,620 [line 288] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: end time of dataset 2019-10-03T23:01:04+00:00 does not match metadata end 2019-10-03T23:01:03.875000+00:00 updating metatdata value to 2019-10-03T23:01:04+00:00
2022-09-07 13:03:09,250 [line 288] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: end time of dataset 2019-10-03T23:01:04+00:00 does not match metadata end 2019-10-03T23:01:03.875000+00:00 updating metatdata value to 2019-10-03T23:01:04+00:00
Wall time: 26.9 s
%%time
station_group.validate_station_metadata()
station_group.write_metadata()
survey_group.update_survey_metadata()
survey_group.write_metadata()
Wall time: 7.68 s
Have a look at the MTH5 structure and make sure it looks correct.
m
/:
====================
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: test
            --------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                        |- Group: dipole_101.00
                        -----------------------
                        |- Group: dipole_106.00
                        -----------------------
                        |- Group: dipole_109.00
                        -----------------------
                        |- Group: e_analog_to_digital
                        -----------------------------
                        |- Group: h_analog_to_digital
                        -----------------------------
                        |- Group: to_mt_units
                        ---------------------
                    |- Group: fap
                    -------------
                    |- Group: fir
                    -------------
                    |- Group: time_delay
                    --------------------
                        |- Group: ex_time_offset
                        ------------------------
                        |- Group: ey_time_offset
                        ------------------------
                        |- Group: hx_time_offset
                        ------------------------
                        |- Group: hy_time_offset
                        ------------------------
                        |- Group: hz_time_offset
                        ------------------------
                    |- Group: zpk
                    -------------
                        |- Group: nims_1_pole_butterworth
                        ---------------------------------
                            --> Dataset: poles
                            ....................
                            --> Dataset: zeros
                            ....................
                        |- Group: nims_3_pole_butterworth
                        ---------------------------------
                            --> Dataset: poles
                            ....................
                            --> Dataset: zeros
                            ....................
                        |- Group: nims_5_pole_butterworth
                        ---------------------------------
                            --> Dataset: poles
                            ....................
                            --> Dataset: zeros
                            ....................
                |- Group: Reports
                -----------------
                |- Group: Standards
                -------------------
                    --> Dataset: summary
                    ......................
                |- Group: Stations
                ------------------
                    |- Group: mnp300
                    ----------------
                        |- Group: Transfer_Functions
                        ----------------------------
                        |- Group: mnp300a
                        -----------------
                            --> Dataset: ex
                            .................
                            --> Dataset: ey
                            .................
                            --> Dataset: hx
                            .................
                            --> Dataset: hy
                            .................
                            --> Dataset: hz
                            .................
                            --> Dataset: temperature
                            ..........................
                        |- Group: mnp300b
                        -----------------
                            --> Dataset: ex
                            .................
                            --> Dataset: ey
                            .................
                            --> Dataset: hx
                            .................
                            --> Dataset: hy
                            .................
                            --> Dataset: hz
                            .................
                            --> Dataset: temperature
                            ..........................
    --> Dataset: channel_summary
    ..............................
    --> Dataset: tf_summary
    .........................
Have a look at the channel summary and make sure everything looks good.
m.channel_summary.summarize()
m.channel_summary.to_dataframe()
| | survey | station | run | latitude | longitude | elevation | component | start | end | n_samples | sample_rate | measurement_type | azimuth | tilt | units | hdf5_reference | run_hdf5_reference | station_hdf5_reference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | ex | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | electric | 0.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
1 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | ey | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | electric | 90.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
2 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | hx | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | magnetic | 0.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
3 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | hy | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | magnetic | 90.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
4 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | hz | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | magnetic | 0.0 | 90.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
5 | test | mnp300 | mnp300a | 34.726823 | -115.735015 | 940.0 | temperature | 2019-09-26 18:33:21+00:00 | 2019-10-01 15:07:08+00:00 | 3357016 | 8.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
6 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | ex | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | electric | 0.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
7 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | ey | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | electric | 90.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
8 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | hx | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | magnetic | 0.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
9 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | hy | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | magnetic | 90.0 | 0.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
10 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | hz | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | magnetic | 0.0 | 90.0 | count | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
11 | test | mnp300 | mnp300b | 34.726823 | -115.735015 | 940.0 | temperature | 2019-10-01 16:22:01+00:00 | 2019-10-03 23:01:04+00:00 | 1573944 | 8.0 | auxiliary | 0.0 | 0.0 | none | <HDF5 object reference> | <HDF5 object reference> | <HDF5 object reference> |
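One simple way to check that the summary looks good is to confirm that each run's duration agrees with its sample count. The sketch below does this for values copied from the table above (one row per run, since all channels in a run share the same times). The helper duration_matches and the one-sample tolerance are illustrative choices, not part of mth5; in the notebook the same comparison could be applied to the DataFrame returned by to_dataframe().

```python
from datetime import datetime

# (run, start, end, n_samples, sample_rate) copied from the channel summary.
rows = [
    ("mnp300a", "2019-09-26 18:33:21+00:00", "2019-10-01 15:07:08+00:00", 3357016, 8.0),
    ("mnp300b", "2019-10-01 16:22:01+00:00", "2019-10-03 23:01:04+00:00", 1573944, 8.0),
]


def duration_matches(start, end, n_samples, sample_rate, tolerance=1):
    """Return True if (end - start) * sample_rate is within `tolerance` samples of n_samples."""
    elapsed = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return abs(elapsed.total_seconds() * sample_rate - n_samples) <= tolerance


results = {run: duration_matches(start, end, n, sr) for run, start, end, n, sr in rows}
print(results)
```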
Important: close the file when you are done with it. Otherwise, opening it from another program or Python interpreter can fail or leave the file in a corrupted state.
m.close_mth5()
2022-09-07 13:03:29,078 [line 744] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing c:\Users\jpeacock\OneDrive - DOI\mt\nims\from_nims.h5
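A common way to make sure the file is always closed, even when an error occurs while writing, is a try/finally block. The sketch below demonstrates the pattern with FakeMTH5, a hypothetical stand-in class that only tracks whether it is open; it mirrors the open_mth5 and close_mth5 method names used in this notebook but does not touch any file.

```python
class FakeMTH5:
    """Hypothetical stand-in for an MTH5 object; only tracks open/closed state."""

    def __init__(self):
        self.is_open = False

    def open_mth5(self, path):
        self.is_open = True

    def close_mth5(self):
        self.is_open = False


fake = FakeMTH5()
fake.open_mth5("from_nims.h5")
try:
    # Simulate something going wrong while data are being written.
    raise RuntimeError("simulated failure while writing")
except RuntimeError as error:
    print(f"caught: {error}")
finally:
    # Runs whether or not the work above succeeded.
    fake.close_mth5()
```

With a real MTH5 object, the body of the try block would hold the loading loop and the finally block would call m.close_mth5().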
run_ts.plot()