BDNYC Database Documentation

Now that the database infrastructure is more complete, I have generated the Sphinx documentation for all the database code. You can find it at

bdnyc.org/database

To have a full understanding of the database, you need both the Sphinx documentation and the Python Database Structure diagram, which is shared in Google Drive.

show_data: Quickly See What Data Exists in the BDNYC Database

In the BDNYC database code, I added a new method called show_data. Its purpose is to show you any and all data that exists in the database for a specific target. The method is part of the BDNYC.py module, under the BDNYCData class.

The only input it needs is the U-number of the target. So, for example, if you want to know what data exists for target U10276, type:

> bdnyc.show_data('U10276')

assuming that bdnyc is the variable holding the database you loaded with the pickle.load(f) command. The method will automatically check whether the target exists in the database, and if it does, it will print out a list of rows, each one describing a piece of data for the target in question. For the case of U10276, you will get the following in your terminal:

['unum', 'U10276']
['index', 74]
['name', 'sd0423']
['sptype', 'T0']
['ra', '04 23 48.6']
['dec', '-04 14 04.0']
['standard', 'Yes']
[0, 'opt', 'med', 'CT 4m', '2002jan25']
[1, 'nir', 'high', 'NIRSPEC', '2006jan10']
[2, 'nir', 'high', 'NIRSPEC', '2005dec11']
[3, 'nir', 'high', 'NIRSPEC', '2001oct09']
[4, 'nir', 'med', 'NIRSPEC', '0000xxx00']
[5, 'nir', 'low', 'IRTF SpeX Prism', '2003sep17']
[6, 'nir', 'phot', '2MASS', ['H', 'K', 'J']]

From this, you can see that there are six spectra in the database: one optical medium-resolution, three nir high-resolution, one nir medium-resolution, and one nir low-resolution. You also see that there is nir photometry for this target in the J, H, and K bands.

Note that the second row gives you the index corresponding to this target in the database. This number is the same result that you would get if you used the method bdnyc.match_unum(unum).

One last thing: to make this output code-friendly, I added the option to return the output as a list object (as opposed to just a terminal print-out), which you can then easily manipulate with your code. To do this, type:

> x = bdnyc.show_data('U10276', dump=True)

And x will hold the list result shown above. Once you can see what data exists for a target, it becomes much easier to access the actual spectrum or photometry you want.

This is the complete method signature:

    bdnyc.show_data(unum, dump=True)
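As a quick end-to-end example, here is how that fits together, from loading the database to slicing the returned list. The pickle file name is a placeholder for wherever your copy of the database lives, and the filtering relies on the row layout shown above:

    import pickle

    # Placeholder path: point this at your local copy of the database pickle.
    f = open('BDNYCData.txt', 'rb')
    bdnyc = pickle.load(f)
    f.close()

    # Get the listing as a list object instead of a terminal print-out.
    x = bdnyc.show_data('U10276', dump=True)

    # For example, keep only the nir spectra (data rows start with an integer index,
    # and the photometry row has 'phot' in the resolution slot).
    nir_spectra = [row for row in x if isinstance(row[0], int)
                   and row[1] == 'nir' and row[2] != 'phot']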

New Database Method: res_initializer()

While working on the database dump (putting all of Kelle’s data into the Python database), I wrote a new method to take care of two issues with the database. In its old state, we needed (and wanted) the resolution levels of the structures (i.e. high, med, low, phot) to be initialized as empty dictionaries when no data exists, but this was not the case. As a result, the dateList() method was no longer working, since it required this to be true. Also, having this level initialized makes the database dump (and any future dumps) easier. And so, this new method was born.

Remember, a method is a function attached to an object. In this case, the method is attached to our Python database. If you named the database bdnyc, then you can call the res_initializer() method with the command:
>>> bdnyc.res_initializer()              # No inputs are needed.

It’s simple: the method loops through all existing targets and adds these empty dictionaries where necessary. I also changed the addTarget method so that when you add a new target to the database, it will automatically run this method for you, unless you explicitly set init=False (if you’re adding multiple targets to the database, you can just wait until you have put all of them in before running the initializer method).
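If it helps to picture what the method does, here is a minimal sketch of the idea; the attribute names (targets, opt, nir, mir) are assumptions made for illustration, not copied from BDNYC.py:

    def res_initializer(db):
        # Loop over every target and make sure each wavelength range carries all
        # four resolution levels, initialized as empty dictionaries when absent.
        for target in db.targets:
            for wavelength_range in (target.opt, target.nir, target.mir):
                for res in ('high', 'med', 'low', 'phot'):
                    wavelength_range.setdefault(res, {})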

If you are worried that I changed how you input new data into the database, don’t be: you can still follow Vivienne’s instructions on how to add data to the database, no changes necessary.

Sherpa Part Three (Spectral Measurements)

Over the course of this semester, I have been using Sherpa, a program written in Python, to model potassium (K) absorption lines and make three specific measurements: the equivalent width, the full width at half maximum, and the line-to-continuum flux ratio. In this entry, I will show the commands I used to make these measurements.

When I use Sherpa, I import it as a python module as such:
>>> from sherpa.astro.ui import *    # All Sherpa functions are found here

If you followed my post on modeling with Sherpa, you know that I used three separate model components to create my model: a power law (pl), a Lorentzian (l1), and a Gaussian (g1). The power law describes the continuum, and the three together represent the data.
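For context, a rough sketch of how those components get set up in Sherpa looks like this; the data file name is a placeholder, and the exact component classes may differ from the ones I used:
>>> load_data(1, 'spectrum.dat')                             # placeholder file name
>>> set_source(1, powlaw1d.pl + lorentz1d.l1 + gauss1d.g1)   # continuum + line components
>>> fit(1)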

To measure the equivalent width of the K line, I use Sherpa’s eqwidth() function. The syntax is:
>>> eqwidth(continuum, continuum+line, lo=lower_wl_limit, hi=upper_wl_limit, id=data_ID)
Using my data for a K line that spans 1.252 µm to 1.253 µm, I measure the equivalent width like so:
>>> eqwidth(pl, pl+l1+g1, lo=1.252, hi=1.253, id=1)

The output will be the equivalent width value, in microns. Nice and simple. The hard part was constructing the models and eyeballing the lower/upper limits for the eq. width measurement. All of this I did manually by eye.

The line-to-continuum flux ratio and full width half maximum measurements are harder. They require locating the actual arrays of values for the various models. In order to access the model’s x and y values (continuum+lorentzian+gaussian), you do:
>>> modelArrays = get_model_plot(1)     # The 1 is the ID number
# modelArrays.x  ->  array of x values
# modelArrays.y  ->  array of y values

To access the y values for the continuum only (pl), you have to do:
>>> key = pl._cache.keys()[0]
>>> cont_yvals = pl._cache[key]
# This may not work if some of the values in pl.__dict__ are not set properly.
# To be honest, I'm not fully sure which ones may screw this up, but for me, I have:
>>> pl._NoNewAttributesAfterInit__initialized = True
>>> pl._use_caching = True
# For me, since I use a power law with an exponent of 0, all of the y values are
# the same, so I can access this value instead by doing:
>>> cont_yval = pl.ampl      # The continuum is a constant value: the amplitude

Armed with the model’s x and y value arrays, and the continuum’s y value array (or value), we can make the measurements. To calculate the line-to-continuum flux ratio, we must first find the minimum (maximum for emission lines) value of the model, and then compare it with the corresponding continuum flux value:
>>> import numpy
>>> F_model = numpy.min(modelArrays.y)        # Minimum of the model
>>> min_ind = numpy.where(modelArrays.y == F_model)[0][0]   # min's array index
>>> F_cont = cont_yvals[min_ind]                      # Continuum value
>>> depth = 1.0 - (F_model/F_cont)                  # The line-to-continuum flux ratio!

The full width at half maximum was a little tougher. The algorithm I used is as follows:
>>> F_halfmax = (F_cont + F_model) / 2.0            # Flux halfway between continuum and line minimum
>>> indices = numpy.where(modelArrays.y <= F_halfmax)[0]
>>> wls = modelArrays.x[indices]
>>> fwhm = wls[-1] - wls[0]

The problem with this algorithm is that sometimes there are other pockets of the spectrum that are equal to or lower than the half maximum, and their wavelengths end up in the wls array too. For those cases, I had to go through the wls array manually and fix it.
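For what it is worth, one way to automate that clean-up is to keep only the contiguous run of points surrounding the line minimum; this is just a sketch built on the arrays defined above, not what I actually ran:
>>> left = right = min_ind
>>> while left - 1 >= 0 and modelArrays.y[left - 1] <= F_halfmax:
...     left -= 1
>>> while right + 1 < len(modelArrays.y) and modelArrays.y[right + 1] <= F_halfmax:
...     right += 1
>>> fwhm = modelArrays.x[right] - modelArrays.x[left]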

That’s how I measured the equivalent width, line-to-continuum flux ratio, and full width at half maximum using Sherpa. If you have any questions or know of a better (or different) way of calculating these values, feel free to leave a comment or email me at danfeldman90@gmail.com.

Editing astrotools

It is important to maintain some standards in astrotools. The goal is to keep it readable and stable. With that in mind, I shall provide some coding guidelines and then describe the structure of astrotools in detail to allow for its easy editing.

Actually, I will let Guido van Rossum (the father of Python) provide you with all the necessary coding guidelines, from naming conventions to code lay-out. You can find them in Guido’s Style Guide for Python Code.

Let me clarify an important aspect of Python modules. A module is a collection of reusable pieces of code, such as functions and classes. A piece of code in a module can use another piece within the same module or in another module, or it can be completely independent. The implication is that you can add a function:

  • that is completely self-contained and that does not interfere with anything else in astrotools
  • that makes use of other astrotools functions/classes
  • that makes use of other Python modules (e.g. numpy), in which case you need an import statement to invoke that module (more on this below).

The order of things in astrotools

The module astrotools is broken down into five sections. From top to bottom in the code, they are described below.

I) Documentation: Contains general information about the contents of the module. This type of non-executable text (a.k.a. docstring) must always start and end with three apostrophes (which I like to call Melchior, Caspar, & Balthazar). This tells Python that the stuff within them is a string that can be called by the end user using Python help.

II) External Modules: Contains the Python commands used to invoke external Python modules —the famous “import” commands. Having them all in one location as opposed to inside the functions/classes where they are needed helps to avoid repetition (when more than one function/class needs to invoke the same external module) and to clearly see the external modules needed to use astrotools.

III) Public Functions: Organized alphabetically, these functions are meant to be accessible to end users. These functions have their own docstrings, surrounded by the three apostrophes, just like in Section I.

IV) Non-Public functions: Organized alphabetically, these functions are only used by other functions/classes within astrotools. As such, end users have no business accessing these. Python convention says non-public names should be preceded by double underscore. Technically, anyone can still access them, but Python code of conduct tells end users not to do it. These functions do not need docstrings, only commented (#) intro lines to explain their purpose and where they are used.

V) Public Classes: Organized alphabetically, these classes are meant to be accessible to end users. Just like with Public Functions, Public Classes have their own docstrings.
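To make that layout concrete, here is a bare-bones skeleton of how the five sections stack up; every function and class in it is made up for illustration and does not exist in astrotools:

    '''
    astrotools skeleton (Section I: the module docstring).

    All names below are placeholders, not real astrotools members.
    '''
    # Section II: external modules
    import numpy as np

    # Section III: public functions, in alphabetical order
    def avg_flux(flux):
        '''
        Return the mean of *flux*, a numpy array of flux values.
        '''
        return np.mean(__finite(flux))

    # Section IV: non-public functions, in alphabetical order
    # __finite drops NaN/inf values; used by avg_flux above.
    def __finite(flux):
        return flux[np.isfinite(flux)]

    # Section V: public classes, in alphabetical order
    class Spectrum:
        '''
        Minimal container pairing a wavelength array with a flux array.
        '''
        def __init__(self, wavelength, flux):
            self.wavelength = wavelength
            self.flux = flux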

How to add a function/class to astrotools

  1. Download the latest version of astrotools from the GitHub repository onto your computer.
  2. If your function/class needs to import an external module, check if it is already imported in Section II of astrotools. If not, add the command line to do so. Take as much advantage as you can of the modules already imported in astrotools before adding a new one. More external modules needed by astrotools means more requirements that end users will have to meet before having a fully functional astrotools module.
  3. Add your function/class in the appropriate location (alphabetically) within the relevant section (III or V) of astrotools.
  4. Your function/class MUST have a docstring, and it should follow reStructuredText syntax. No panic! It is very simple. For instance, if you want a word in italics, you write: *word*. Consult the reStructuredText User Reference for more information. In terms of format and contents, follow the example of the docstrings of existing astrotools functions (see also the sketch after this list). Include in the docstring a description of all input parameters in your function/class, including their formats. Remember, the point of documenting your code is to allow others to easily use it and edit it in case you die. This is why I like to call the docstring “the death string”.
  5. If your function/class comes with a sub-function, that is, a side function that you wrote that is used by your main function/class, this means you are a good (object-oriented) programmer! The place to put it is Section IV of astrotools. Include in it intro comments about its purpose and where it is used. If you think that this sub-function could also be useful to end users, by all means add it in Section III instead. Just remember to write a docstring for it.
  6. Test the function/class that you added. Test extreme cases, test wrong input formats, test, test, test!
  7. Update astrotools.py in the GitHub repository with your new edited version.
  8. Inform the person that maintains the public documentation for astrotools so that it gets updated in bdnyc.org/astrotools.
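For step 4, here is roughly what such a docstring can look like; the function and its parameters are invented purely to show the reStructuredText format:

    def shift_wavelengths(wavelength, rv):
        '''
        Shift the *wavelength* array by the radial velocity *rv*.

        *wavelength*
          Numpy array of wavelengths, in microns.
        *rv*
          Radial velocity in km/s; positive values redshift the spectrum.
        '''
        return wavelength * (1.0 + rv / 299792.458)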

Regularizing Data

I have been trying to cross correlate two spectra with the intent of finding the radial velocity of one relative to the other.  I’ve been working on a Python code to do this for some time, but was having complications.  Although there is clearly a shift between the two spectra, the numpy cross correlation function that the code uses kept coming up with a shift of zero.  I recently came across the concept of “regularizing” data on a python help website.  Someone was having a similar problem (coming up with a cross correlation result of a zero shift), and it was pointed out that they weren’t regularizing their data.  To regularize data is to subtract off the mean, and divide by the standard deviation.  I tried doing this in my code, and it actually started to work, and give me very nice answers!  The problem is that I don’t particularly understand why this makes the cross correlation work.
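To make the recipe concrete, here is a minimal sketch of the regularization step feeding into numpy's cross correlation; the two flux arrays are toy stand-ins for spectra sampled on a common wavelength grid:

    import numpy as np

    def regularize(flux):
        # Subtract the mean and divide by the standard deviation
        return (flux - np.mean(flux)) / np.std(flux)

    # Toy stand-ins for the two spectra (real data would share a wavelength grid)
    flux1 = np.exp(-0.5 * ((np.arange(200) - 100.0) / 5.0) ** 2)
    flux2 = np.roll(flux1, 3)

    corr = np.correlate(regularize(flux1), regularize(flux2), mode='full')
    lag = np.argmax(corr) - (len(flux2) - 1)   # correlation peak relative to zero lag, in pixels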

Is anyone familiar with the concept of regularizing data? Or has anyone had a similar problem regarding cross correlations?