Bayesian Photometric Redshift Estimation: The code

Prerequisites

  • Python 2.2.0 or higher. If you have a relatively recent Linux distribution, it is probably already installed. Versions as old as 2.0 (or even earlier) will probably work, so you can try running bpz with your current Python before upgrading.

  • If you don't have Python installed, well... you should, even if you are not going to use bpz. Python is the most elegant, productive and easy-to-learn language around.
     
  • The  Numerical Python package (Numpy). This is a free (and fast) numerical extension to Python.
  • Optionally, you may want to install biggles, a Python 2D scientific plotting package able to produce publication-quality plots. It is not needed to run the code: bpz detects whether biggles is installed and simply skips the plotting if it is not.
Installation

  • Gunzip and untar the file bpz.v.tgz, where v stands for the version number. This creates a directory called bpz.v, which contains the code and subdirectories with the filters, SEDs, etc.
  • Set the environment variable BPZPATH to the full path of the directory bpz.v.
  • Define the alias bpz='python $BPZPATH/bpz.py'
Testing

    Go to bpz.v and type "python test.py".
    This should do three things:

Running the code

    A fundamental law of photo-z estimation is the GIGO principle:  "Garbage in, garbage out". This means that the results that you obtain from bpz will be, at best, as good as  your photometry. So before you run the code, make sure that your photometric calibration is right.

    Then you have to follow these steps:

    1. Define your template library

    If you are happy to use the CWWSB template set described in Benitez 2000 you can skip this step and go to step 2.

     If not, put the templates you want to use in the bpz.v/SED directory as individual files containing the wavelength in Å and f_λ in arbitrary units (e.g. El_cww.sed). Then create a file, either in that same directory or in the directory where you are running bpz, containing the names of the templates you want to use (e.g. CWWSB.list). The order of the templates within this file matters if you are going to use a template-based prior afterwards. The name of this file is the value of the parameter SPECTRA.

    2. Check that the filters you are going to use are in the FILTER directory.

    If not, put them there as individual files containing the wavelength in Å and the response. The first time you use a given combination of filter and spectrum, the program calculates the AB observed fluxes as a function of redshift and stores them in the directory AB. This usually takes a while, but it is done only once (unless you change your templates and want to recalculate the AB fluxes, in which case use the NEW_AB option, see below).

    You can generate the total filter throughputs and copy them to the  subdirectory FILTER using the program gen_trans.py:

    3. Create a file containing the photometric catalog.

    The photometry has to be in magnitudes or in "magnitude" fluxes, i.e. f=10^(-0.4m). There are a few conventions to follow:

    The program will crash by design if you include negative magnitudes, fluxes or errors in the photometric catalog, with the exception of the -99.0 flag, so check the catalog thoroughly before running it through bpz.py.
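    The conversion and the pre-flight check described above can be sketched in a few lines of standalone Python. This is only an illustration, not code taken from bpz: the function names and the example rows are hypothetical, and the check assumes that only the numeric photometry columns (magnitudes or fluxes and their errors) are passed in.

```python
FLAG_VALUE = -99.0  # the only negative entry the text says is allowed


def magnitude_to_flux(mag):
    """Convert a magnitude to a "magnitude" flux, f = 10^(-0.4 m)."""
    return 10.0 ** (-0.4 * mag)


def find_bad_values(rows):
    """Return (row, column, value) for each negative entry that is not the flag."""
    bad = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value < 0 and value != FLAG_VALUE:
                bad.append((i, j, value))
    return bad


# Hypothetical catalog rows: magnitude, error, magnitude, error
catalog = [
    [24.3, 0.05, 23.9, 0.04],
    [-99.0, 0.0, 25.1, 0.10],   # flagged entry, accepted
    [22.0, -0.02, 21.8, 0.03],  # negative error, must be fixed first
]
problems = find_bad_values(catalog)
for row, col, value in problems:
    print("row %d, column %d: unexpected negative value %s" % (row, col, value))
```

    Running a check of this kind before calling bpz.py makes it easier to locate the offending entry than waiting for the program to stop.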

    4. Create a *.columns file describing what the photometric catalog contains. 

    An example can be found in hdfn_z.columns. If the photometric catalog name is 'name.cat', the default name for this file will be 'name.columns'. This default can be changed using a command line option (see below).

    The first lines of this associated file, or "filter" lines contain the following columns:

    1. The name of the filter (e.g. if the filter response in the directory FILTER is called F300W_WFPC2.res, you have to write F300W_WFPC2)
    2. The columns containing the flux and the errors (e.g. 2,9)
    3. The calibration system for the photometry, 'AB' or 'Vega'. Even if the input is 'Vega', bpz will convert all the photometry to AB fluxes using its synthetic photometry library before performing the comparison with the redshifted templates.
    4. The estimated uncertainty  in the calibration zero point. This  will be added quadratically to the photometric errors of the individual objects. 
    5. A zero point magnitude offset (in case one wants to change the normalization of some filter without having to modify the photometric catalog)
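    As an illustration of item 4, adding the zero point uncertainty "quadratically" means combining it with each object's photometric error as the square root of the sum of squares. The sketch below is hypothetical (the function name is not part of bpz) and works in magnitudes; this section does not say whether bpz applies the combination in magnitude or in flux space.

```python
import math


def total_error(photometric_error, zero_point_error):
    """Combine two independent errors in quadrature: sqrt(a^2 + b^2)."""
    return math.hypot(photometric_error, zero_point_error)


# Hypothetical values: a 0.03 mag photometric error and a 0.04 mag
# zero point uncertainty combine to roughly 0.05 mag.
combined = total_error(0.03, 0.04)
print(round(combined, 4))
```

    Note that the zero point term sets a floor on the error: however small the photometric error of a bright object is, the combined error never drops below the zero point uncertainty.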

    After the 'filter' lines, several keywords can be included, each followed by the number of the corresponding column in the photometric catalog file. These columns will also be printed in the output file.

  • ID corresponds to the object identification
  • M_0 corresponds to the magnitude used for the prior determination. The presence of this parameter triggers the use of a magnitude-based prior.
  • OTHER corresponds to one or several columns from the input file which are just copied over to the output file. They are treated as strings and added at the end of the file.
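    To make the layout concrete, the sketch below assembles the text of a small columns file from the fields listed above. It is an assumption-laden illustration: the whitespace-separated layout, the helper name filter_line and every column number and error value are hypothetical, so compare the result with hdfn_z.columns before relying on it.

```python
def filter_line(name, flux_col, err_col, system, zp_error, zp_offset):
    """Build one 'filter' line: name, flux/error columns, system, zp error, zp offset."""
    return "%s  %d,%d  %s  %s  %s" % (name, flux_col, err_col, system, zp_error, zp_offset)


lines = [
    # Filter name taken from the example in the text; the numbers are made up.
    filter_line("F300W_WFPC2", 2, 9, "AB", 0.01, 0.0),
    "ID  1",   # hypothetical: object identification in column 1
    "M_0  4",  # hypothetical: magnitude used for the prior in column 4
]
columns_text = "\n".join(lines) + "\n"
print(columns_text)
```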
    5. Run the program:

             bpz.py my_data.cat [-P my_data.pars] [-PAR1 value -PAR2 value1,value2,value3 ...]

    Command line options:
    -P  my_data.pars  This option updates the parameter values using the
      contents of a file. This file must have the following format:
        PAR1   value                                 # comment about parameter 1
        PAR2   value1,value2,value3     # comment about parameter 2
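    The format above is simple enough to read with a few lines of Python. The following is a minimal sketch of such a reader, not the parser bpz itself uses: the function name parse_pars is hypothetical, values are kept as strings, and comma-separated values are returned as lists.

```python
def parse_pars(text):
    """Parse 'NAME value[,value...]  # comment' lines into a dictionary."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError("parameter without a value: %r" % raw)
        name, value = parts
        values = [v.strip() for v in value.split(",")]
        params[name] = values if len(values) > 1 else values[0]
    return params


# Parameter names taken from the option list in this document.
example = """
DZ      0.01      # resolution of the redshift grid
ZC      1.,2.     # cluster redshifts
PRIOR   hdfn
"""
pars = parse_pars(example)
print(pars)
```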

    In addition, any of the parameters in bpz.py can be modified on the command line.
    Their default values (assuming that the photometric file is called my_data.cat)
    are shown next to the parameter name.

    -COLUMNS my_data.columns
                              The file containing the descriptions of the columns in the input photometry file.

    -OUTPUT  my_data.bpz
                             The file containing the photo-z.

    -SPECTRA CWWSB.list
                            The file containing the list of templates. The default is CWWSB.list (The four Coleman, Wu
                            and Weedman types, plus two Kinney et al. 1996 starbursts)

    -PRIOR  hdfn
                      The default value is the prior derived from the HDFN and the CFRS (Benitez 2000).
                      The other possible values are 'flat' and 'none', in which case no prior is used.

    -DZ 0.01
         Resolution of the redshift grid. The intervals are logarithmic, (1+z)*dz

    -ZMIN 0.01
          Minimum redshift

    -ZMAX  6.5
           Maximum redshift
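    One way to read the DZ description is that each step of the grid has size (1+z)*dz, so that 1+z grows by a constant factor (1+dz) from one point to the next. The sketch below builds such a grid from ZMIN, ZMAX and DZ. It is an interpretation of the text, not the construction used inside bpz.py, and the function name is hypothetical.

```python
def redshift_grid(zmin, zmax, dz):
    """Return redshifts from zmin up to zmax with steps of (1+z)*dz."""
    grid = []
    z = zmin
    while z <= zmax:
        grid.append(z)
        z += dz * (1.0 + z)  # the step widens with redshift
    return grid


grid = redshift_grid(0.01, 6.5, 0.01)  # default ZMIN, ZMAX and DZ
print(len(grid), grid[0], grid[-1])
```

    With this construction the grid is much coarser at high redshift than at low redshift, which keeps the relative resolution dz/(1+z) constant.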

    -MAG yes
            The data in the photometric catalogs are interpreted as magnitudes by default.
             If not, they are treated as fluxes.

    -NEW_AB no
             If 'yes', recalculates the AB files even if they are already present in the AB directory. It is useful
             if the filter shape or spectral templates have been modified but  their names are the  same as before.

    -EXCLUDE none
          If set to FILTER_NAME1,FILTER_NAME2, the data corresponding
           to those filters are excluded from the estimation.

    -CHECK yes
            This option estimates the average of the ratio between the observed fluxes and
             the model fluxes corresponding to the best fit. It is approximately 1 for the HDF-N spectroscopic
             sample, which shows that the CWW+SB templates agree reasonably well with the HDFN
             spectroscopic data. It may be useful to detect calibration errors either in the photometry or in the
            template set. If spectroscopic redshifts are present in the input catalog, it also prints the comparison
           between the observed colors and the expected ones from the templates.
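    The quantity reported by CHECK can be pictured as a per-filter average of observed over model flux. The sketch below shows that idea with a plain unweighted mean. The function name and the sample numbers are hypothetical, and this section does not say whether bpz weights or clips the ratios, so treat this as an illustration only.

```python
def mean_flux_ratios(observed, model):
    """Average observed/model flux per filter over all objects.

    observed and model are lists of rows, one row per object and one
    entry per filter. Non-positive fluxes are skipped (an assumption).
    """
    n_filters = len(observed[0])
    ratios = []
    for k in range(n_filters):
        per_object = [obs[k] / mod[k]
                      for obs, mod in zip(observed, model)
                      if obs[k] > 0 and mod[k] > 0]
        ratios.append(sum(per_object) / len(per_object) if per_object else None)
    return ratios


# Hypothetical fluxes for two objects in two filters
observed = [[1.0, 2.2], [3.0, 4.4]]
best_fit = [[1.0, 2.0], [2.0, 4.0]]
ratios = mean_flux_ratios(observed, best_fit)
print(ratios)
```

    A ratio that departs clearly from 1 in a single filter points to a zero point problem in that filter or to a mismatch in the templates.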

    -ZC 1.,2.  -FC .2,.4  Adds gaussian 'spikes' to the prior at the redshifts indicated in ZC. The parameter
               FC represents the fraction of the total number of galaxies expected to be in each cluster
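    The effect of ZC and FC can be sketched as a mixture: the field prior keeps a fraction 1 - sum(FC) of the probability, and each cluster contributes a Gaussian holding a fraction FC of it. The code below is a hypothetical illustration. The spike width sigma, the flat field prior and the function names are assumptions, since this section does not state how bpz shapes the spikes.

```python
import math


def gaussian(z, mean, sigma):
    """Normal density at z."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def add_spikes(z_grid, prior, zc, fc, sigma=0.02):
    """Mix Gaussian spikes at redshifts zc, with fractions fc, into a prior."""
    total = sum(prior)
    mixed = [(1.0 - sum(fc)) * p / total for p in prior]  # field galaxies
    for z_c, f_c in zip(zc, fc):
        spike = [gaussian(z, z_c, sigma) for z in z_grid]
        norm = sum(spike)
        for i, s in enumerate(spike):
            mixed[i] += f_c * s / norm  # cluster galaxies
    return mixed


z_grid = [i * 0.01 for i in range(301)]  # 0 <= z <= 3
flat = [1.0] * len(z_grid)               # hypothetical flat field prior
prior = add_spikes(z_grid, flat, zc=[1.0, 2.0], fc=[0.2, 0.4])
print(round(sum(prior), 6))
```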

    -VERBOSE yes
         If set to 'no', stops bpz from printing to the screen the results of the photo-z estimation.
        (it doesn't affect the output from CHECK and PROBS)

    -INTERP 0
      This introduces n points of interpolation between the templates in the color space. It is not optimized yet
      and considerably slows down the program, but it seems to yield more precise results, especially at low redshift,
      even with values as low as INTERP 2.

    -ODDS 0.95
      This number is used to define the redshift confidence limits and also the interval over which
      we integrate to find the empirical odds (the interval is such that a gaussian probability distribution
      will contain e.g. 95% of the probability)
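    The interval described above can be illustrated as follows: for a Gaussian, the half-width that encloses a fraction ODDS of the probability is k standard deviations, with k about 1.96 for 0.95, and the empirical odds are the fraction of the actual redshift probability that falls inside that interval around the best redshift. The sketch assumes values that this section does not give: the reference width sigma, its scaling with (1+z), and probabilities already binned per grid point. All names are hypothetical.

```python
import math
from statistics import NormalDist


def gaussian_half_width(odds):
    """Number of standard deviations enclosing a fraction `odds` of a Gaussian."""
    return NormalDist().inv_cdf(0.5 * (1.0 + odds))


def empirical_odds(z_grid, pdf, z_best, odds=0.95, sigma=0.06):
    """Fraction of the probability within k*sigma*(1+z_best) of z_best."""
    half = gaussian_half_width(odds) * sigma * (1.0 + z_best)
    total = sum(pdf)
    inside = sum(p for z, p in zip(z_grid, pdf) if abs(z - z_best) <= half)
    return inside / total


# A Gaussian p(z) with exactly the reference width recovers odds near 0.95.
z_grid = [i * 0.001 for i in range(3001)]
pdf = [math.exp(-0.5 * ((z - 1.0) / 0.12) ** 2) for z in z_grid]
print(round(gaussian_half_width(0.95), 3))
print(round(empirical_odds(z_grid, pdf, z_best=1.0), 3))
```

    A narrow, single-peaked probability gives odds close to 1, while a broad or multi-peaked one gives lower odds, which is what makes this number useful as a quality indicator.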

    -PROBS no
            If its value is different from 'no', it writes the final Bayesian probability and all the individual type
            likelihoods to a file named after the parameter value. It may increase the running time by a factor of 3 or
            more, and it generates a huge file, so use with caution. There is a lot of room for improvement here.

    -PROBS_LITE no
           This saves the final redshift probability distribution for each galaxy, which contains all the information about
    the photometric redshift.

    -GET_Z yes
         If this option is 'no', bpz does not estimate new photometric redshifts. It performs the rest of its functions
         like plotting, generating AB files, etc. but instead of estimating new photo-z, it reads them from
        an already existing OUTPUT file.

    -INTERACTIVE no
       If on, bpz will query the user about plot options, etc.

    -PLOTS no
       If on, it will produce some useful plots.