Skip to content

Commit 85d7d4c

Browse files
committed
Swapping out pandas
1 parent ee3f93f commit 85d7d4c

File tree

2 files changed

+51
-58
lines changed

2 files changed

+51
-58
lines changed

tutorials/euclid_access/1_Euclid_intro_MER_images.md

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -65,7 +65,6 @@ Each MER image is approximately 1.47 GB. Downloading can take some time.
6565
import re
6666
6767
import numpy as np
68-
import pandas as pd
6968
7069
import matplotlib.pyplot as plt
7170
from matplotlib.patches import Ellipse
@@ -80,9 +79,6 @@ from astropy import units as u
8079
8180
from astroquery.ipac.irsa import Irsa
8281
import sep
83-
84-
# Copy-on-write is more performant and avoids unexpected modifications of the original DataFrame.
85-
pd.options.mode.copy_on_write = True
8682
```
8783

8884
## 1. Search for multiwavelength Euclid Q1 MER mosaics that cover the star HD 168151

tutorials/euclid_access/5_Euclid_intro_SPE_catalog.md

Lines changed: 51 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,7 @@ If you have questions about this notebook, please contact the [IRSA helpdesk](ht
5050

5151
```{code-cell} ipython3
5252
# Uncomment the next line to install dependencies if needed
53-
# !pip install matplotlib pandas astropy 'astroquery>=0.4.10'
53+
# !pip install matplotlib astropy 'astroquery>=0.4.10'
5454
```
5555

5656
```{code-cell} ipython3
@@ -59,11 +59,10 @@ import urllib
5959
6060
import matplotlib.pyplot as plt
6161
import numpy as np
62-
import pandas as pd
6362
6463
from astropy.coordinates import SkyCoord
6564
from astropy.io import fits
66-
from astropy.table import Table
65+
from astropy.table import QTable
6766
from astropy import units as u
6867
from astropy.utils.data import download_file
6968
from astropy.visualization import ImageNormalize, PercentileInterval, AsinhStretch
@@ -80,41 +79,41 @@ search_radius = 10 * u.arcsec
8079
coord = SkyCoord.from_name('HD 168151')
8180
```
8281

83-
### Use IRSA to search for all Euclid data on this target
82+
```{tip}
83+
The IRSA SIA collections can be listed using using the ``list_collections`` method, we can filter on the ones containing "euclid" in the collection name:
8484
85-
This searches specifically in the euclid_DpdMerBksMosaic "collection" which is the MER images and catalogs.
85+
Irsa.list_collections(filter='euclid')
86+
```
8687

87-
```{code-cell} ipython3
88-
im_table = Irsa.query_sia(pos=(coord, search_radius), collection='euclid_DpdMerBksMosaic')
88+
+++
8989

90-
## Convert the table to pandas dataframe
91-
df_im_irsa=im_table.to_pandas()
92-
```
90+
### Use IRSA to search for all Euclid data on this target
91+
92+
This searches specifically in the ``euclid_DpdMerBksMosaic`` collection which is the MER images and catalogs.
9393

9494
```{code-cell} ipython3
95-
## Change the settings so we can see all the columns in the dataframe and the full column width
96-
## (to see the full long URL)
97-
pd.set_option('display.max_columns', None)
98-
pd.set_option('display.max_colwidth', None)
95+
image_table = Irsa.query_sia(pos=(coord, search_radius), collection='euclid_DpdMerBksMosaic')
9996
```
10097

101-
#### This dataframe contains other non-Euclid datasets that have been "Euclidized", meaning they have been put on the same pixel scale as the Euclid data. For this example we just want to look at the Euclid data, so select Euclid for the facility name, and choose science as the data product subtype.
98+
This table lists all MER mosaic images available in this search position. These mosaics include the Euclid VIS, Y, J, H images, as well as ground-based telescopes which have been put on the same pixel scale. For more information, see the [Euclid documentation at IPAC](https://euclid.caltech.edu/page/euclid-faq-tech/).
10299

103-
```{code-cell} ipython3
104-
df_im_euclid=df_im_irsa[ (df_im_irsa['dataproduct_subtype']=='science') & (df_im_irsa['facility_name']=='Euclid')]
100+
Note that there are various image types are returned as well, we filter out the `science` images from these:
105101

106-
df_im_euclid.head()
102+
```{code-cell} ipython3
103+
science_images = image_table[image_table['dataproduct_subtype'] == 'science']
104+
science_images
107105
```
108106

109-
## Choose the VIS image and pull the filename:
107+
### Choose the VIS image and pull the Tile ID
110108

111-
```{code-cell} ipython3
112-
filename=df_im_euclid[df_im_euclid['energy_bandpassname']=='VIS']['access_url'].to_list()[0]
109+
+++
113110

114-
# ## Extract the tileID from the filename
115-
tileID=re.search(r'TILE\s*(\d{9})', filename).group(1)
111+
Extract the tile ID from the ``obs_id`` column. The values in this column are made a combination of the 9 digit tile ID and the abbreviation of the instrument.
112+
113+
```{code-cell} ipython3
114+
tileID = science_images[science_images['energy_bandpassname'] == 'VIS']['obs_id'][0][:9]
116115
117-
print('The MER tile ID for this object is :',tileID)
116+
print(f'The MER tile ID for this object is : {tileID}')
118117
```
119118

120119
## 2. Download SPE catalog from IRSA directly to this notebook
@@ -137,10 +136,14 @@ table_lines = 'euclid_q1_spe_lines_line_features'
137136
- List the column names
138137

139138
```{code-cell} ipython3
140-
columns_info = Irsa.list_columns(catalog=table_lines)
139+
columns_info = Irsa.list_columns(catalog=table_mer)
141140
print(len(columns_info))
142141
```
143142

143+
```{code-cell} ipython3
144+
Irsa.list_columns(catalog=table_1dspectra, full=True)??
145+
```
146+
144147
```{code-cell} ipython3
145148
# Full list of columns and their description
146149
columns_info
@@ -159,28 +162,25 @@ We specify the following conditions on our search:
159162
Finally we sort the data by descending spe_line_snr_gf to have the largest SNR H-alpha lines detected at the top.
160163

161164
```{code-cell} ipython3
162-
adql = f"SELECT DISTINCT mer.object_id,mer.ra, mer.dec, mer.tileid, mer.flux_y_templfit, \
163-
lines.spe_line_snr_gf,lines.spe_line_snr_di, lines.spe_line_name, lines.spe_line_central_wl_gf,\
164-
lines.spe_line_ew_gf, galaxy.spe_z_err, galaxy.spe_z,galaxy.spe_z_prob, lines.spe_line_flux_gf, lines.spe_line_flux_err_gf \
165-
FROM {table_mer} AS mer \
166-
JOIN {table_lines} AS lines \
167-
ON mer.object_id = lines.object_id \
168-
JOIN {table_galaxy_candidates} AS galaxy \
169-
ON lines.object_id = galaxy.object_id AND lines.spe_rank = galaxy.spe_rank \
170-
WHERE lines.spe_line_snr_gf >5 \
171-
AND lines.spe_line_name = 'Halpha' \
172-
AND mer.tileid = {tileID} \
173-
AND galaxy.spe_z_prob > 0.99 \
174-
AND galaxy.spe_z BETWEEN 1.4 AND 1.6 \
175-
AND lines.spe_line_flux_gf > 2E-16 \
176-
ORDER BY lines.spe_line_snr_gf DESC \
177-
"
165+
adql_query = ("SELECT DISTINCT mer.object_id,mer.ra, mer.dec, mer.tileid, mer.flux_y_templfit, "
166+
"lines.spe_line_snr_gf,lines.spe_line_snr_di, lines.spe_line_name, lines.spe_line_central_wl_gf, "
167+
"lines.spe_line_ew_gf, galaxy.spe_z_err, galaxy.spe_z,galaxy.spe_z_prob, "
168+
"lines.spe_line_flux_gf, lines.spe_line_flux_err_gf "
169+
f"FROM {table_mer} AS mer "
170+
f"JOIN {table_lines} AS lines "
171+
"ON mer.object_id = lines.object_id "
172+
f"JOIN {table_galaxy_candidates} AS galaxy "
173+
"ON lines.object_id = galaxy.object_id AND lines.spe_rank = galaxy.spe_rank "
174+
"WHERE lines.spe_line_snr_gf >5 "
175+
"AND lines.spe_line_name = 'Halpha' "
176+
f"AND mer.tileid = {tileID} "
177+
"AND galaxy.spe_z_prob > 0.99 "
178+
"AND galaxy.spe_z BETWEEN 1.4 AND 1.6 "
179+
"AND lines.spe_line_flux_gf > 2E-16 "
180+
"ORDER BY lines.spe_line_snr_gf DESC ")
178181
179182
# Use TAP with this ADQL string
180-
result = Irsa.query_tap(adql)
181-
182-
# Convert table to pandas dataframe and drop duplicates
183-
result_table = result.to_qtable()
183+
result_table = Irsa.query_tap(adql_query).to_qtable()
184184
185185
result_table['spe_line_flux_gf'].info.format = ".8e" # Scientific notation with 8 decimal places
186186
result_table['spe_line_flux_err_gf'].info.format = ".8e"
@@ -202,34 +202,31 @@ obj_tab
202202
```{code-cell} ipython3
203203
adql_object = f"SELECT * FROM {table_1dspectra} WHERE objectid = {obj_id}"
204204
205-
result2 = Irsa.query_tap(adql_object)
206-
df2 = result2.to_table().to_pandas()
207-
df2
205+
result_table2 = Irsa.query_tap(adql_object).to_qtable()
208206
```
209207

210208
### The following steps to read in the spectrum follows the 3_Euclid_intro_1D_spectra notebook.
211209

212210
This involves reading in the spectrum without readin in the full FITS file, just pulling the extension we want.
213211

214212
```{code-cell} ipython3
215-
file_uri = urllib.parse.urljoin(Irsa.tap_url, result2['uri'][0])
213+
file_uri = urllib.parse.urljoin(Irsa.tap_url, result_table2['uri'][0])
216214
file_uri
217215
```
218216

219217
```{code-cell} ipython3
220218
with fits.open(file_uri) as hdul:
221-
hdu = hdul[df2['hdu'].iloc[0]]
222-
dat = Table.read(hdu, format='fits', hdu=1)
223-
df_obj_irsa = dat.to_pandas()
219+
spectrum = QTable.read(hdul[result_table2['hdu'][0]], format='fits')
220+
spec_header = hdul[result_table2['hdu'][0]].header
224221
```
225222

226223
### Now the data are read in, plot the spectrum with the H-alpha line labeled
227224

228225
Divide by 10000 to convert from Angstrom to micron
229226

230227
```{code-cell} ipython3
231-
wavelengths = obj_tab['spe_line_central_wl_gf']/10000.
232-
line_names = obj_tab['spe_line_name']
228+
wavelengths = object_spectra['spe_line_central_wl_gf']/10000.
229+
line_names = object_spectra['spe_line_name']
233230
snr_gf = obj_tab['spe_line_snr_gf']
234231
235232
plt.plot(df_obj_irsa['WAVELENGTH']/10000., df_obj_irsa['SIGNAL'])

0 commit comments

Comments
 (0)