Overview of the core Python language
There are many excellent Python tutorials on the web. We will not attempt to provide another one here. Some of our favorites are:
Python is a general-purpose programming language. It is used for many purposes and was not conceived of as a language for data management and data analysis (or for any other single purpose).
Python implementations exist for many platforms and hardware configurations. The core language and libraries behave in a highly consistent way across a variety of platforms (Windows, Linux, Mac, and many others).
Matlab and R are "domain-specific languages" (DSLs). They were primarily designed for one purpose. Scientists who like to think of computer programs simply as sequences of calculations seem to prefer simple languages that translate mathematical logic into computer code in a straightforward, declarative manner. Matlab and Julia reflect this perspective. On the other hand, people who have learned some basic ideas from modern computer science often prefer a language that incorporates capabilities such as more advanced data structures (e.g. associative arrays), functions as first-class objects, closures, and pass-by-reference semantics. Most or all of this is available in R and Matlab in some way, but perhaps not very naturally or as the default.
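As a rough illustration of the capabilities listed above, here is a minimal sketch in plain Python. All names (`counts`, `apply_twice`, `make_scaler`, `append_zero`) are made up for this example. Note that what is loosely called pass-by-reference here means that a function receives a reference to the caller's object, so in-place changes to a mutable argument are visible to the caller.

```python
# Associative array: the built-in dict maps keys to values.
counts = {}
for word in ["a", "b", "a"]:
    counts[word] = counts.get(word, 0) + 1

# Functions are first-class objects: they can be passed as arguments.
def apply_twice(func, x):
    return func(func(x))

# Closure: the inner function remembers `factor` from the enclosing call.
def make_scaler(factor):
    def scale(x):
        return factor * x
    return scale

double = make_scaler(2)

# Reference semantics: the function receives a reference to the same
# list object, so the in-place append is visible to the caller.
def append_zero(values):
    values.append(0)

data = [1, 2]
append_zero(data)

print(counts)                  # word counts stored in a dict
print(apply_twice(double, 3))  # double applied two times to 3
print(data)                    # the caller's list was modified
```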
Python itself has rather poor numerical performance compared to compiled languages like C and Fortran. However, Python has a large number of excellent libraries that allow very good (even excellent) numerical performance on many problems.
Due to the huge community of Python users, the core language and interpreter have been heavily optimized for performance. Interpreted Python is generally faster than interpreted R. However, in any interpreted language, complicated operations will be somewhat slow. Most generic operations in Python (like sorting a list) are implemented in C. These generic operations will take roughly the same time to execute in any well-implemented language (e.g. R, Matlab, Python, C, Java).
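One way to see the difference between interpreted code and a built-in implemented in C is to time the same operation written both ways. The sketch below uses summation rather than sorting so that both versions perform the same algorithm; `loop_sum` is a hypothetical helper written for this example, and the actual timings will vary by machine and Python version.

```python
import timeit

def loop_sum(values):
    # Explicit loop: every iteration is executed by the interpreter.
    total = 0
    for v in values:
        total += v
    return total

data = list(range(100000))

# The built-in sum() runs its loop in C (in the standard CPython
# interpreter), so it usually finishes noticeably faster.
loop_time = timeit.timeit(lambda: loop_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print("explicit loop: %.4f s" % loop_time)
print("built-in sum:  %.4f s" % builtin_time)
```

Both versions return the same result; only the place where the loop runs differs.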
It is possible to write C extensions to Python. This is made particularly easy by a tool called "Cython", which is not covered in this workshop. However, most users will rarely if ever need to use Cython. Thanks to excellent Python libraries such as NumPy, it is often possible to express complex calculations in such a way that most of the work takes place in the library (where time-critical components will already have been written in C).
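To make the idea of pushing work into the library concrete, here is a small sketch that computes a mean squared error twice: once with an explicit Python loop and once with NumPy array operations. The function names and the sample data are invented for this illustration, and it assumes NumPy is installed.

```python
import numpy as np

def mse_loop(x, y):
    # Pure Python: each subtraction and multiplication is interpreted.
    total = 0.0
    for a, b in zip(x, y):
        total += (a - b) ** 2
    return total / len(x)

def mse_numpy(x, y):
    # NumPy: the element-wise arithmetic and the mean run inside the
    # library's compiled code, not in the Python interpreter.
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(diff ** 2))

x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 2.0, 2.0, 4.0]

print(mse_loop(x, y))
print(mse_numpy(x, y))
```

On arrays this small the two versions take similar time; the benefit of the library version typically shows up as the arrays grow.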
Python is not (and was not intended to be) an exotic or revolutionary language. It should generally be quite familiar and intuitive to anyone familiar with generic "pseudo-code". It has a few distinguishing features, most notably, the use of indentation rather than braces to define code blocks.
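The use of indentation to define code blocks can be seen in a short sketch. The function `classify` is a made-up example; the point is only that the bodies of `def`, `if`, and `else` are delimited by their indentation level rather than by braces or `end` keywords.

```python
def classify(n):
    # Everything indented under `def` belongs to the function body.
    if n % 2 == 0:
        # Indented under `if`: runs only when n is even.
        label = "even"
    else:
        # Indented under `else`: runs only when n is odd.
        label = "odd"
    # Back at the function's indentation level: runs in both cases.
    return label

labels = [classify(n) for n in range(4)]
print(labels)
```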
The Python community is currently progressing through a transition from the "2 series" Python implementations to the "3 series" Python implementations. Changes in the series number (1, 2, and 3 so far) indicate a major break with backward compatibility. Python 3 scripts may not run in Python 2, and Python 2 scripts may not run in Python 3. Nearly all libraries needed to be substantially modified to work in Python 3, and this process is now largely complete. Versions within a series (e.g. 3.1 versus 3.2) are generally compatible with each other; they differ mainly in bug fixes and newly introduced features.
There are many small changes and a few large changes from Python 2 to Python 3. These changes are generally of little consequence to most users of Python for scientific purposes.
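A few of the better-known differences can be sketched as follows. The code is written for Python 3, and the comments describe how Python 2 behaved; this is an illustrative selection, not a complete list of changes.

```python
# Division: in Python 3, `/` on two integers gives a float.
# In Python 2, the same expression gave the integer 3.
quotient = 7 / 2

# Floor division with `//` behaves the same way in both series.
floor_quotient = 7 // 2

# print is a function in Python 3; in Python 2 it was a statement
# written without parentheses.
print("quotient:", quotient, "floor:", floor_quotient)

# Text and bytes: in Python 3, str holds Unicode text and bytes is a
# separate type. In Python 2, the default string type held bytes.
text = "caf\u00e9"
encoded = text.encode("utf-8")
print(len(text), len(encoded))

# range() returns a lazy sequence in Python 3 instead of a list.
numbers = range(3)
print(list(numbers))
```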
Many general-purpose Python users switched from version 2 to version 3 long ago. Users of Python for scientific research have been much slower to switch to Python 3, since certain libraries were not available in Python 3 versions until recently. Scientific Python users are beginning to switch to Python 3 in large numbers.
Python 2 development is frozen at version 2.7. Some bug fixes will be backported, but all interesting future development and improvement will take place in the Python 3 series.