Here are my notes from Jake Vanderplas' talk at RuPy 2013. I'm certainly no expert on astronomy, so caveat emptor. :)
- LSST is a project for exploring the universe.
- Astronomy has a long history of looking up at the sky. In the early 20th century, Edwin Hubble learned many things with a big telescope. We've come a long way since: the Hubble Space Telescope is now in Low Earth Orbit, imaging the universe.
- Today astronomy has moved to "survey astronomy". The Sloan Digital Sky Survey was an automated survey that ran for several years, mapping in great detail the locations of and distances to distant galaxies.
- The data volumes are growing.
- The LSST is the Large Synoptic Survey Telescope: an extremely large telescope that will repeatedly scan the entire Southern night sky, producing incredible pictures that add up to what's really a digital movie.
- 8.5 meter primary mirror
- A 3000-megapixel CCD camera with a field of view of 9.5 deg². The largest digital camera in the world!
- Produces a 10-year, full-sky digital color movie.
- 30,000 GB (30 TB) of data per night.
- They need to do real-time processing, because they're looking for changes in the data, so they can issue alerts and get other telescopes to follow up.
- Lots of institutions involved.
- How do we prepare the astronomy community for this deluge of data?
- With the first data from LSST, it took about a year for people to figure out how to use it and to make papers based on it.
- If you can't work with the data, you're going to fall behind.
- The LSST image simulation simulates a virtual universe with stars as point sources, the photons they emit, and how the photons deflect as they go through the atmosphere and hit the mirrors. It also simulates the physics of the CCD and how it behaves when hit by photons.
- Builds up very realistic images of what the LSST is going to produce once it gets started.
- Can also use the simulation to design the surveys.
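The simulation pipeline described above can be sketched in miniature: render point sources through a Gaussian point-spread function onto a CCD grid and add Poisson photon noise. This is a toy illustration only (the function and parameter names are my own); the real LSST simulator models the atmosphere, optics, and detector physics in far greater detail.

```python
# Toy sketch of survey image simulation (my own function and parameter
# names): stars as point sources smeared by a Gaussian point-spread
# function, with Poisson photon noise added at the "CCD" stage.
import numpy as np

def simulate_exposure(size=64, stars=((20, 20, 500.0), (45, 30, 800.0)),
                      psf_sigma=1.5, sky_level=10.0, rng=None):
    """Render (x, y, flux) point sources onto a noisy CCD grid."""
    rng = np.random.default_rng(rng)
    yy, xx = np.mgrid[0:size, 0:size]
    image = np.full((size, size), sky_level, dtype=float)
    for x, y, flux in stars:
        # The PSF spreads each star's light over nearby pixels
        image += flux * np.exp(-((xx - x) ** 2 + (yy - y) ** 2)
                               / (2 * psf_sigma ** 2))
    # Photon counting on the CCD is a Poisson process
    return rng.poisson(image).astype(float)

img = simulate_exposure(rng=42)
```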
- A key component is called Difference Imaging: take two exposures from consecutive nights and compare/subtract them to see changes, such as a supernova. In LSST there are going to be ~2M of these kinds of events per night, of which 500K will be random (asteroids, stochastic variability), but several thousand will be "interesting": supernovae, planets crossing in front of stars, etc.
- A real-time alert stream will, within 60s of exposure, categorise these events into a "Twitter stream of supernovae" that you can go and follow up on.
- Dark energy was discovered by looking at supernovae, leading to the discovery that the universe is expanding at an increasing rate.
- Cosmological information: by looking at the geometry of the universe you can learn about "Baryon Oscillations", traces of primordial fluctuations in the universe.
- Gives information about the transient universe by discovering gamma ray bursts, quasars, stellar flares, asteroids and comets.
- Light echoes give information about supernovae that happened far in the past.
- Image differencing is hard: you need to rotate and align the images, detect sources, account for point spread ("smearing"), deconvolve, and subtract. An interesting problem.
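A minimal sketch of the subtract-and-detect step, with toy data and made-up names, assuming the two exposures are already aligned and PSF-matched (which, as noted above, is the hard part):

```python
# Minimal sketch of difference imaging's subtract-and-detect step.
# Assumes the exposures are already registered and PSF-matched.
import numpy as np

def find_transients(night1, night2, n_sigma=5.0):
    """Subtract consecutive exposures and flag pixels that changed."""
    diff = night2 - night1
    # Robust noise estimate via the median absolute deviation
    sigma = 1.4826 * np.median(np.abs(diff - np.median(diff)))
    return np.argwhere(np.abs(diff) > n_sigma * sigma)

rng = np.random.default_rng(0)
base = rng.normal(100.0, 5.0, size=(64, 64))       # the static sky
night1 = base + rng.normal(0.0, 1.0, size=base.shape)
night2 = base + rng.normal(0.0, 1.0, size=base.shape)
night2[40, 12] += 200.0                            # a "supernova" on night two
print(find_transients(night1, night2))
```

The static sky cancels out in the subtraction, so only the pixel that genuinely brightened between the two nights exceeds the noise threshold.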
- The data comes in through "data acquisition" and goes into a database holding a model of what the sky looks like. Each image is compared to that model.
- A big pipeline with lots of intensive computation, which needs to run in real time.
- The core of the software is a C++ object model, written over the last 10 years.
- It must be useful for astronomers and future surveys. For this, there's a SWIG-generated Python layer that mirrors the C++ object model. Some problems, but it's working out very well.
- Astronomers don't like C++ code, they like their interpreted, dynamic languages.
- The aim is for it to become a standard library for astronomical calculations in the future.
- Python is easy to use - even for scientists!
- Open source and cross platform
- IDL (Interactive Data Language) is also used a lot in science, but it's proprietary. Running it on 100 cores requires 100 site licenses, and a parallel job may take up all your licenses, stopping everyone else's work.
- Python provides good tooling for interacting with compiled languages.
- Good scientific packages: NumPy, SciPy, Matplotlib, IPython. Lots of stuff for machine learning, astronomy, network analysis,... Strong ecosystem.
- IPython Notebook: a way of sharing code, graphics, and analysis all in one package where it can be executed and reproduced easily.
- AstroML: A Python machine learning package for Astronomy. 200+ examples of real analysis on real data already.
- -> Python is becoming the tool of choice for much data-driven research.
- The scientific community does struggle with packaging. Packages that rely on compiled C code (NumPy, SciPy) can't be reliably installed with pip.
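As a flavor of the stack listed above, here's a tiny toy example (made-up data and function names of my own) using NumPy and SciPy to fit a noisy, periodic variable-star light curve:

```python
# Toy example of the scientific Python stack: recover the amplitude and
# period of a variable star from noisy observations. All data here is
# synthetic and for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def lightcurve(t, amplitude, period, phase, baseline):
    return baseline + amplitude * np.sin(2 * np.pi * t / period + phase)

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 200)                   # observation times (days)
truth = lightcurve(t, 2.0, 3.0, 0.5, 15.0)    # "true" magnitudes
observed = truth + rng.normal(0.0, 0.2, t.size)

# Least-squares fit, starting from a rough initial guess
params, _ = curve_fit(lightcurve, t, observed, p0=[1.5, 3.1, 0.3, 15.0])
amplitude, period = params[0], params[1]
```

A Matplotlib plot of `observed` against `lightcurve(t, *params)` would complete the typical workflow; in an IPython Notebook, the code, plot, and discussion all live in one shareable document.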
Reproducibility in Science & The IPython Notebook
- Reproducibility becomes more and more important as research becomes more data-driven.
- An economics paper called "Growth in a Time of Debt" could not be reproduced from the existing data: the original Excel spreadsheet included an error that changed the result.
- With the IPython Notebook, you can reproduce everything, rerun all code cell by cell. If people adopted a platform like this and published results in this way, we would be in a much better place in many scientific fields.