Issue #2730 remove extracted numpy directories and files

Change-Id: Ib1f47007c8820e89f75dadc25682aff7f356e99d

Former-commit-id: 9b73789b20 [formerly 329f95c875e55790821801fb0f7d70c26adc54f2]
Former-commit-id: 0e7f4d714d
Brad Gonzales 2014-04-10 15:04:02 -05:00
parent 74a7189143
commit 99d4fc15ef
661 changed files with 0 additions and 323750 deletions

@@ -1,59 +0,0 @@
X.flat returns an indexable 1-D iterator (mostly similar to an array
but always 1-d) --- only has .copy and .__array__ attributes of an array!!!
.typecode() --> .dtype.char
.iscontiguous() --> .flags['CONTIGUOUS'] or .flags.contiguous
.byteswapped() -> .byteswap()
.itemsize() -> .itemsize
.toscalar() -> .item()
If you used typecode characters:
'c' -> 'S1' or 'c'
'b' -> 'B'
'1' -> 'b'
's' -> 'h'
'w' -> 'H'
'u' -> 'I'
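For illustration, the renames above look like this in practice (a sketch
only; 'a' stands for any array):

    import numpy as np
    a = np.array([1, 2, 3], dtype=np.int16)
    c = a.dtype.char        # was a.typecode(); gives 'h' for int16
    f = a.flags.contiguous  # was a.iscontiguous()
    b = a.byteswap()        # was a.byteswapped()
    n = a.itemsize          # was a.itemsize(); an attribute now, not a method
    s = a[0].item()         # was .toscalar(); returns a plain Python scalar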
C-level
some API calls that used to take PyObject * now take PyArrayObject *
(this should only cause warnings during compile and not actual problems).
PyArray_Take
These commands now return a buffer that must be freed once it is used,
using PyDataMem_FREE(ptr);
a->descr->zero --> PyArray_Zero(a)
a->descr->one --> PyArray_One(a)
Numeric/arrayobject.h --> numpy/oldnumeric.h
# These will actually work and are defines for PyArray_BYTE,
# but you really should change it in your code
PyArray_CHAR --> PyArray_CHAR
(or PyArray_STRING which is more flexible)
PyArray_SBYTE --> PyArray_BYTE
Any uses of character codes will need adjusting....
use PyArray_XXXLTR where XXX is the name of the type.
If you used function pointers directly (why did you do that?),
the arguments have changed. Everything that was an int is now an intp.
Also, arrayobjects should be passed in at the end.
a->descr->cast[i](fromdata, fromstep, todata, tostep, n)
a->descr->cast[i](fromdata, todata, n, PyArrayObject *in, PyArrayObject *out)
Anything but single-stepping is not supported by this function;
use the PyArray_CastXXXX functions instead.

@@ -1,19 +0,0 @@
Thank you for your willingness to help make NumPy the best array system
available.
We have a few simple rules:
* try hard to keep the SVN repository in a buildable state and to not
indiscriminately muck with what others have contributed.
* Simple changes (including bug fixes) and obvious improvements are
always welcome. Changes that fundamentally change behavior need
discussion on numpy-discussion@scipy.org before anything is
done.
* Please add meaningful comments when you check changes in. These
comments form the basis of the change-log.
* Add unit tests to exercise new code, and regression tests
whenever you fix a bug.

@@ -1,139 +0,0 @@
.. -*- rest -*-
.. vim:syntax=rest
.. NB! Keep this document a valid restructured document.
Building and installing NumPy
+++++++++++++++++++++++++++++
:Authors: Numpy Developers <numpy-discussion@scipy.org>
:Discussions to: numpy-discussion@scipy.org
.. Contents::
PREREQUISITES
=============
Building NumPy requires the following software installed:
1) Python__ 2.4.x or newer
On Debian and derivatives (Ubuntu): python python-dev
On Windows: the official python installer from Python__ is enough
Make sure that the Python package distutils is installed before
continuing. For example, in Debian GNU/Linux, distutils is included
in the python-dev package.
Python must also be compiled with the zlib module enabled.
2) nose__ (optional) 0.10.3 or later
This is required for testing numpy, but not for using it.
Python__ http://www.python.org
nose__ http://somethingaboutorange.com/mrl/projects/nose/
Fortran ABI mismatch
====================
The two most popular open source fortran compilers are g77 and gfortran.
Unfortunately, they are not ABI compatible, which means that concretely you
should avoid mixing libraries built with one compiler with libraries built
with the other. In particular, if your blas/lapack/atlas is built with g77,
you *must* use g77 when building numpy and scipy; conversely, if your atlas
is built with gfortran, you *must* build numpy/scipy with gfortran.
Choosing the fortran compiler
-----------------------------
To build with g77:
python setup.py build --fcompiler=gnu
To build with gfortran:
python setup.py build --fcompiler=gnu95
How to check the ABI of blas/lapack/atlas
-----------------------------------------
One relatively simple and reliable way to check for the compiler used to build
a library is to use ldd on the library. If libg2c.so is a dependency, this
means that g77 has been used. If libgfortran.so is a dependency, gfortran has
been used. If both are dependencies, this means both have been used, which is
almost always a very bad idea.
Building with ATLAS support
===========================
Ubuntu 8.10 (Intrepid)
----------------------
You can install the necessary packages for optimized ATLAS with this command:
sudo apt-get install libatlas-base-dev
If you have a recent CPU with SIMD support (SSE, SSE2, etc...), you should
also install the corresponding package for optimal performance. For example,
for SSE2:
sudo apt-get install libatlas3gf-sse2
*NOTE*: if you build your own atlas, Intrepid changed its default fortran
compiler to gfortran. So you should rebuild everything from scratch, including
lapack, to use it on Intrepid.
Ubuntu 8.04 and lower
---------------------
You can install the necessary packages for optimized ATLAS with this command:
sudo apt-get install atlas3-base-dev
If you have a recent CPU with SIMD support (SSE, SSE2, etc...), you should
also install the corresponding package for optimal performance. For example,
for SSE2:
sudo apt-get install atlas3-sse2
Windows 64-bit notes
=====================
Note: only AMD64 is supported (IA64 is not) - AMD64 is the version most people
want.
Free compilers (mingw-w64)
--------------------------
http://mingw-w64.sourceforge.net/
To use the free compilers (mingw-w64), you need to build your own toolchain, as
the mingw project only distributes cross-compilers (cross-compilation is not
supported by numpy). Since this toolchain is still being worked on, serious
compiler bugs can be expected. binutils 2.19 + gcc 4.3.3 + mingw-w64 runtime
gives you a working C compiler (but the C++ compiler is broken). gcc 4.4 will
hopefully be able to run natively.
This is the only tested way to get a numpy with a FULL blas/lapack (scipy does
not work because of C++).
MS compilers
------------
If you are familiar with MS tools, that's obviously the easiest path, and the
compilers are hopefully more mature (although in my experience, they are quite
fragile, and often segfault on invalid C code). The main drawback is that no
fortran compiler + MS compiler combination has been tested - mingw-w64 gfortran
+ MS compiler does not work at all (it is unclear whether it ever will).
For python 2.5, you need VS 2005 (MS compiler version 14) targeting
AMD64, or the Platform SDK v6.0 or below (which gives command
line versions of the 64-bit target compilers). The PSDK is free.
For python 2.6, you need VS 2008. The freely available version does not
contain the 64-bit compilers (you also need the PSDK, v6.1).
It is *crucial* to use the right version: python 2.5 -> version 14, python
2.6 -> version 15. You can check the compiler version with cl.exe /?. Note
also that for python 2.5, the 64-bit and 32-bit versions use a different
compiler version.

@@ -1,30 +0,0 @@
Copyright (c) 2005-2009, NumPy Developers.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
* Neither the name of the NumPy Developers nor the names of any
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

@@ -1,20 +0,0 @@
#
# Use .add_data_files and .add_data_dir methods in appropriate
# setup.py files to include non-python files such as documentation,
# data, etc files to distribution. Avoid using MANIFEST.in for that.
#
include MANIFEST.in
include LICENSE.txt
include setupscons.py
include setupsconsegg.py
include setupegg.py
# Adding scons build related files not found by distutils
recursive-include numpy/core/code_generators *.py *.txt
recursive-include numpy/core *.in *.h
recursive-include numpy SConstruct SConscript
# Add documentation: we don't use add_data_dir since we do not want to include
# this at installation, only for sdist-generated tarballs
include doc/Makefile doc/postprocess.py
recursive-include doc/release *
recursive-include doc/source *
recursive-include doc/sphinxext *

@@ -1,37 +0,0 @@
Metadata-Version: 1.0
Name: numpy
Version: 1.5.0b1
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://numpy.scipy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Download-URL: http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103
Description: NumPy is a general-purpose array-processing package designed to
efficiently manipulate large multi-dimensional arrays of arbitrary
records without sacrificing too much speed for small multi-dimensional
arrays. NumPy is built on the Numeric code base and adds features
introduced by numarray as well as an extended C-API and the ability to
create arrays of arbitrary type which also makes NumPy suitable for
interfacing with general-purpose data-base applications.
There are also basic facilities for discrete fourier transform,
basic linear algebra and random number generation.
Platform: Windows
Platform: Linux
Platform: Solaris
Platform: Mac OS-X
Platform: Unix
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved
Classifier: Programming Language :: C
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS

@@ -1,23 +0,0 @@
NumPy is the fundamental package needed for scientific computing with Python.
This package contains:
* a powerful N-dimensional array object
* sophisticated (broadcasting) functions
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities.
It derives from the old Numeric code base and can be used as a replacement for Numeric. It also adds the features introduced by numarray and can be used to replace numarray.
More information can be found at the website:
http://scipy.org/NumPy
After installation, tests can be run with:
python -c 'import numpy; numpy.test()'
The most current development version is always available from our
subversion repository:
http://svn.scipy.org/svn/numpy/trunk

@@ -1,62 +0,0 @@
Travis Oliphant for the NumPy core, the NumPy guide, various
bug-fixes and code contributions.
Paul Dubois, who implemented the original Masked Arrays.
Pearu Peterson for f2py, numpy.distutils and help with code
organization.
Robert Kern for mtrand, bug fixes, help with distutils, code
organization, strided tricks and much more.
Eric Jones for planning and code contributions.
Fernando Perez for code snippets, ideas, bugfixes, and testing.
Ed Schofield for matrix.py patches, bugfixes, testing, and docstrings.
Robert Cimrman for array set operations and numpy.distutils help.
John Hunter for code snippets from matplotlib.
Chris Hanley for help with records.py, testing, and bug fixes.
Travis Vaught for administration, community coordination and
marketing.
Joe Cooper, Jeff Strunk for administration.
Eric Firing for bugfixes.
Arnd Baecker for 64-bit testing.
David Cooke for many code improvements including the auto-generated C-API,
and optimizations.
Andrew Straw for help with the web-page, documentation, packaging and
testing.
Alexander Belopolsky (Sasha) for Masked array bug-fixes and tests,
rank-0 array improvements, scalar math help and other code additions.
Francesc Altet for unicode, work on nested record arrays, and bug-fixes.
Tim Hochberg for getting the build working on MSVC, optimization
improvements, and code review.
Charles (Chuck) Harris for the sorting code originally written for
Numarray and for improvements to polyfit, many bug fixes, delving
into the C code, release management, and documentation.
David Huard for histogram improvements including 2-D and d-D code and
other bug-fixes.
Stefan van der Walt for numerous bug-fixes, testing and documentation.
Albert Strasheim for documentation, bug-fixes, regression tests and
Valgrind expertise.
David Cournapeau for build support, doc-and-bug fixes, and code
contributions including fast_clipping.
Jarrod Millman for release management, community coordination, and code
clean up.
Chris Burns for work on memory mapped arrays and bug-fixes.
Pauli Virtanen for documentation, bug-fixes, lookfor and the
documentation editor.
A.M. Archibald for no-copy-reshape code, strided array tricks,
documentation and bug-fixes.
Pierre Gerard-Marchant for rewriting masked array functionality.
Roberto de Almeida for the buffered array iterator.
Alan McIntyre for updating the NumPy test framework to use nose, improve
the test coverage, and enhancing the test system documentation.
Joe Harrington for administering the 2008 Documentation Sprint.
NumPy is based on the Numeric (Jim Hugunin, Paul Dubois, Konrad
Hinsen, and David Ascher) and NumArray (Perry Greenfield, J Todd
Miller, Rick White and Paul Barrett) projects. We thank them for
paving the way ahead.
Institutions
------------
Enthought for providing resources and finances for development of NumPy.
UC Berkeley for providing travel money and hosting numerous sprints.
The University of Central Florida for funding the 2008 Documentation Marathon.
The University of Stellenbosch for hosting the buildbot.

@@ -1,165 +0,0 @@
# Makefile for Sphinx documentation
#

PYVER =
PYTHON = python$(PYVER)

# You can set these variables from the command line.
SPHINXOPTS  =
SPHINXBUILD = LANG=C sphinx-build
PAPER       =

NEED_AUTOSUMMARY = $(shell $(PYTHON) -c 'import sphinx; print sphinx.__version__ < "0.7" and "1" or ""')

# Internal variables.
PAPEROPT_a4     = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS   = -d build/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source

.PHONY: help clean html web pickle htmlhelp latex changes linkcheck \
        dist dist-build

#------------------------------------------------------------------------------

help:
	@echo "Please use \`make <target>' where <target> is one of"
	@echo "  html       to make standalone HTML files"
	@echo "  pickle     to make pickle files (usable by e.g. sphinx-web)"
	@echo "  htmlhelp   to make HTML files and a HTML help project"
	@echo "  latex      to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
	@echo "  changes    to make an overview over all changed/added/deprecated items"
	@echo "  linkcheck  to check all external links for integrity"
	@echo "  dist PYVER=...   to make a distribution-ready tree"
	@echo "  upload USER=...  to upload results to docs.scipy.org"

clean:
	-rm -rf build/* source/reference/generated

#------------------------------------------------------------------------------
# Automated generation of all documents
#------------------------------------------------------------------------------

# Build the current numpy version, and extract docs from it.
# We have to be careful of some issues:
#
# - Everything must be done using the same Python version
# - We must use eggs (otherwise they might override PYTHONPATH on import).
# - Different versions of easy_install install to different directories (!)
#

INSTALL_DIR = $(CURDIR)/build/inst-dist/
INSTALL_PPH = $(INSTALL_DIR)/lib/python$(PYVER)/site-packages:$(INSTALL_DIR)/local/lib/python$(PYVER)/site-packages:$(INSTALL_DIR)/lib/python$(PYVER)/dist-packages:$(INSTALL_DIR)/local/lib/python$(PYVER)/dist-packages

DIST_VARS=SPHINXBUILD="LANG=C PYTHONPATH=$(INSTALL_PPH) python$(PYVER) `which sphinx-build`" PYTHON="PYTHONPATH=$(INSTALL_PPH) python$(PYVER)" SPHINXOPTS="$(SPHINXOPTS)"

UPLOAD_TARGET = $(USER)@docs.scipy.org:/home/docserver/www-root/doc/numpy/

upload:
	@test -e build/dist || { echo "make dist is required first"; exit 1; }
	@test output-is-fine -nt build/dist || { \
	  echo "Review the output in build/dist, and do 'touch output-is-fine' before uploading."; exit 1; }
	rsync -r -z --delete-after -p \
	    $(if $(shell test -f build/dist/numpy-ref.pdf && echo "y"),, \
	         --exclude '**-ref.pdf' --exclude '**-user.pdf') \
	    $(if $(shell test -f build/dist/numpy-chm.zip && echo "y"),, \
	         --exclude '**-chm.zip') \
	    build/dist/ $(UPLOAD_TARGET)

dist:
	make $(DIST_VARS) real-dist

real-dist: dist-build html
	test -d build/latex || make latex
	make -C build/latex all-pdf
	-test -d build/htmlhelp || make htmlhelp-build
	-rm -rf build/dist
	cp -r build/html build/dist
	perl -pi -e 's#^\s*(<li><a href=".*?">NumPy.*?Manual.*?&raquo;</li>)#<li><a href="/">Numpy and Scipy Documentation</a> &raquo;</li>#;' build/dist/*.html build/dist/*/*.html build/dist/*/*/*.html
	cd build/html && zip -9r ../dist/numpy-html.zip .
	cp build/latex/numpy-*.pdf build/dist
	-zip build/dist/numpy-chm.zip build/htmlhelp/numpy.chm
	cd build/dist && tar czf ../dist.tar.gz *
	chmod ug=rwX,o=rX -R build/dist
	find build/dist -type d -print0 | xargs -0r chmod g+s

dist-build:
	rm -f ../dist/*.egg
	cd .. && $(PYTHON) setupegg.py bdist_egg
	install -d $(subst :, ,$(INSTALL_PPH))
	$(PYTHON) `which easy_install` --prefix=$(INSTALL_DIR) ../dist/*.egg

#------------------------------------------------------------------------------
# Basic Sphinx generation rules for different formats
#------------------------------------------------------------------------------

generate: build/generate-stamp
build/generate-stamp: $(wildcard source/reference/*.rst)
	mkdir -p build
ifeq ($(NEED_AUTOSUMMARY),1)
	$(PYTHON) \
	    ./sphinxext/autosummary_generate.py source/reference/*.rst \
	    -p dump.xml -o source/reference/generated
endif
	touch build/generate-stamp

html: generate
	mkdir -p build/html build/doctrees
	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) build/html
	$(PYTHON) postprocess.py html build/html/*.html
	@echo
	@echo "Build finished. The HTML pages are in build/html."

pickle: generate
	mkdir -p build/pickle build/doctrees
	$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) build/pickle
	@echo
	@echo "Build finished; now you can process the pickle files or run"
	@echo "  sphinx-web build/pickle"
	@echo "to start the sphinx-web server."

web: pickle

htmlhelp: generate
	mkdir -p build/htmlhelp build/doctrees
	$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) build/htmlhelp
	@echo
	@echo "Build finished; now you can run HTML Help Workshop with the" \
	      ".hhp project file in build/htmlhelp."

htmlhelp-build: htmlhelp build/htmlhelp/numpy.chm
%.chm: %.hhp
	-hhc.exe $^

qthelp: generate
	mkdir -p build/qthelp build/doctrees
	$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) build/qthelp

latex: generate
	mkdir -p build/latex build/doctrees
	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) build/latex
	$(PYTHON) postprocess.py tex build/latex/*.tex
	perl -pi -e 's/\t(latex.*|pdflatex) (.*)/\t-$$1 -interaction batchmode $$2/' build/latex/Makefile
	@echo
	@echo "Build finished; the LaTeX files are in build/latex."
	@echo "Run \`make all-pdf' or \`make all-ps' in that directory to" \
	      "run these through (pdf)latex."

coverage: build
	mkdir -p build/coverage build/doctrees
	$(SPHINXBUILD) -b coverage $(ALLSPHINXOPTS) build/coverage
	@echo "Coverage finished; see c.txt and python.txt in build/coverage"

changes: generate
	mkdir -p build/changes build/doctrees
	$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) build/changes
	@echo
	@echo "The overview file is in build/changes."

linkcheck: generate
	mkdir -p build/linkcheck build/doctrees
	$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) build/linkcheck
	@echo
	@echo "Link check complete; look for any errors in the above output " \
	      "or in build/linkcheck/output.txt."

@@ -1,59 +0,0 @@
#!/usr/bin/env python
"""
%prog MODE FILES...

Post-processes HTML and Latex files output by Sphinx.
MODE is either 'html' or 'tex'.

"""
import re, optparse

def main():
    p = optparse.OptionParser(__doc__)
    options, args = p.parse_args()

    if len(args) < 1:
        p.error('no mode given')

    mode = args.pop(0)

    if mode not in ('html', 'tex'):
        p.error('unknown mode %s' % mode)

    for fn in args:
        f = open(fn, 'r')
        try:
            if mode == 'html':
                lines = process_html(fn, f.readlines())
            elif mode == 'tex':
                lines = process_tex(f.readlines())
        finally:
            f.close()

        f = open(fn, 'w')
        f.write("".join(lines))
        f.close()

def process_html(fn, lines):
    return lines

def process_tex(lines):
    """
    Remove unnecessary section titles from the LaTeX file.

    """
    new_lines = []
    for line in lines:
        if (line.startswith(r'\section{numpy.')
            or line.startswith(r'\subsection{numpy.')
            or line.startswith(r'\subsubsection{numpy.')
            or line.startswith(r'\paragraph{numpy.')
            or line.startswith(r'\subparagraph{numpy.')
           ):
            pass  # skip!
        else:
            new_lines.append(line)
    return new_lines

if __name__ == "__main__":
    main()

@@ -1,278 +0,0 @@
=========================
NumPy 1.3.0 Release Notes
=========================
This minor release includes numerous bug fixes, official python 2.6 support, and
several new features such as generalized ufuncs.
Highlights
==========
Python 2.6 support
~~~~~~~~~~~~~~~~~~
Python 2.6 is now supported on all previously supported platforms, including
windows.
http://www.python.org/dev/peps/pep-0361/
Generalized ufuncs
~~~~~~~~~~~~~~~~~~
There is a general need for looping over not only functions on scalars but also
over functions on vectors (or arrays), as explained on
http://scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions. We propose to
realize this concept by generalizing the universal functions (ufuncs), and
provide a C implementation that adds ~500 lines to the numpy code base. In
current (specialized) ufuncs, the elementary function is limited to
element-by-element operations, whereas the generalized version supports
"sub-array" by "sub-array" operations. The Perl vector library PDL provides a
similar functionality and its terms are re-used in the following.
Each generalized ufunc has information associated with it that states what the
"core" dimensionality of the inputs is, as well as the corresponding
dimensionality of the outputs (the element-wise ufuncs have zero core
dimensions). The list of the core dimensions for all arguments is called the
"signature" of a ufunc. For example, the ufunc numpy.add has signature
"(),()->()" defining two scalar inputs and one scalar output.
Another example is (see the GeneralLoopingFunctions page) the function
inner1d(a,b) with a signature of "(i),(i)->()". This applies the inner product
along the last axis of each input, but keeps the remaining indices intact. For
example, where a is of shape (3,5,N) and b is of shape (5,N), this will return
an output of shape (3,5). The underlying elementary function is called 3*5
times. In the signature, we specify one core dimension "(i)" for each input and
zero core dimensions "()" for the output, since it takes two 1-d arrays and
returns a scalar. By using the same name "i", we specify that the two
corresponding dimensions should be of the same size (or one of them is of size
1 and will be broadcasted).
The dimensions beyond the core dimensions are called "loop" dimensions. In the
above example, this corresponds to (3,5).
The usual numpy "broadcasting" rules apply, where the signature determines how
the dimensions of each input/output object are split into core and loop
dimensions:
While an input array has a smaller dimensionality than the corresponding number
of core dimensions, 1's are prepended to its shape. The core dimensions are
removed from all inputs and the remaining dimensions are broadcast, defining
the loop dimensions. The output is given by the loop dimensions plus the
output core dimensions.
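For illustration, here is a sketch using the test-only ufunc inner1d (assumed
here to be importable from numpy.core.umath_tests), whose signature is
"(i),(i)->()":
>>> import numpy as np
>>> from numpy.core.umath_tests import inner1d
>>> a = np.random.randn(3, 5, 4)
>>> b = np.random.randn(5, 4)
>>> inner1d(a, b).shape   # core dimension i=4 is consumed; loop dims remain
(3, 5)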
Experimental Windows 64-bit support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Numpy can now be built on 64-bit windows (amd64 only, not IA64), with both MS
compilers and mingw-w64 compilers.
This is *highly experimental*: DO NOT USE FOR PRODUCTION USE. See INSTALL.txt,
Windows 64 bits section for more information on limitations and how to build it
by yourself.
New features
============
Formatting issues
~~~~~~~~~~~~~~~~~
Float formatting is now handled by numpy instead of the C runtime: this enables
locale-independent formatting and more robust fromstring and related methods.
Special values (inf and nan) are also more consistent across platforms (nan vs
IND/NaN, etc...), and more consistent with recent python formatting work (in
2.6 and later).
Nan handling in max/min
~~~~~~~~~~~~~~~~~~~~~~~
The maximum/minimum ufuncs now reliably propagate nans. If one of the
arguments is a nan, then nan is returned. This affects np.min/np.max, amin/amax
and the array methods max/min. New ufuncs fmax and fmin have been added to deal
with non-propagating nans.
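For example (an illustrative sketch; the printed formatting of nan varies
across versions):
>>> import numpy as np
>>> np.maximum([1.0, np.nan], [0.5, 2.0])  # -> array([ 1., nan]): nan propagates
>>> np.fmax([1.0, np.nan], [0.5, 2.0])     # -> array([ 1., 2.]): non-nan wins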
Nan handling in sign
~~~~~~~~~~~~~~~~~~~~
The ufunc sign now returns nan for the sign of a nan.
New ufuncs
~~~~~~~~~~
#. fmax - same as maximum for integer types and non-nan floats. Returns the
non-nan argument if one argument is nan and returns nan if both arguments
are nan.
#. fmin - same as minimum for integer types and non-nan floats. Returns the
non-nan argument if one argument is nan and returns nan if both arguments
are nan.
#. deg2rad - converts degrees to radians, same as the radians ufunc.
#. rad2deg - converts radians to degrees, same as the degrees ufunc.
#. log2 - base 2 logarithm.
#. exp2 - base 2 exponential.
#. trunc - truncate floats to nearest integer towards zero.
#. logaddexp - add numbers stored as logarithms and return the logarithm
of the result.
#. logaddexp2 - add numbers stored as base 2 logarithms and return the base 2
logarithm of the result.
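A few of the new ufuncs in action (an illustrative sketch):
>>> import numpy as np
>>> np.log2(8.0)                                   # -> 3.0
>>> np.trunc([-1.7, 1.7])                          # -> array([-1., 1.])
>>> np.logaddexp(np.log(1e-300), np.log(1e-300))   # log(2e-300), no underflow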
Masked arrays
~~~~~~~~~~~~~
Several new features and bug fixes, including:
* structured arrays should now be fully supported by MaskedArray
(r6463, r6324, r6305, r6300, r6294...)
* Minor bug fixes (r6356, r6352, r6335, r6299, r6298)
* Improved support for __iter__ (r6326)
* made baseclass, sharedmask and hardmask accessible to the user (but
read-only)
* doc update
gfortran support on windows
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Gfortran can now be used as a fortran compiler for numpy on windows, even when
the C compiler is Visual Studio (VS 2005 and above; VS 2003 will NOT work).
Gfortran + Visual studio does not work on 64-bit windows (but gcc + gfortran
does). It is unclear whether it will be possible to use gfortran and visual
studio at all on x64.
Arch option for windows binary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Automatic arch detection can now be bypassed from the command line for the superpack installer:
numpy-1.3.0-superpack-win32.exe /arch=nosse
will install a numpy which works on any x86, even if the running computer
supports the SSE instruction set.
Deprecated features
===================
Histogram
~~~~~~~~~
The semantics of histogram have been modified to fix long-standing issues
with outlier handling. The main changes concern
#. the definition of the bin edges, now including the rightmost edge, and
#. the handling of upper outliers, now ignored rather than tallied in the
rightmost bin.
The previous behavior is still accessible using `new=False`, but this is
deprecated, and will be removed entirely in 1.4.0.
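Under the new semantics (a sketch):
>>> import numpy as np
>>> np.histogram([1, 2, 3, 4, 10], bins=[0, 2, 4])[0]  # -> array([1, 3])
The value 4 is tallied in the rightmost bin [2, 4] since the edge is now
included, while the upper outlier 10 is ignored instead of being added to it.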
Documentation changes
=====================
A lot of documentation has been added. Both user guide and references can be
built from sphinx.
New C API
=========
Multiarray API
~~~~~~~~~~~~~~
The following functions have been added to the multiarray C API:
* PyArray_GetEndianness: to get runtime endianness
Ufunc API
~~~~~~~~~~~~~~
The following functions have been added to the ufunc API:
* PyUFunc_FromFuncAndDataAndSignature: to declare a more general ufunc
(generalized ufunc).
New defines
~~~~~~~~~~~
New public C defines are available for ARCH specific code through numpy/npy_cpu.h:
* NPY_CPU_X86: x86 arch (32 bits)
* NPY_CPU_AMD64: amd64 arch (x86_64, NOT Itanium)
* NPY_CPU_PPC: 32 bits ppc
* NPY_CPU_PPC64: 64 bits ppc
* NPY_CPU_SPARC: 32 bits sparc
* NPY_CPU_SPARC64: 64 bits sparc
* NPY_CPU_S390: S390
* NPY_CPU_IA64: ia64
* NPY_CPU_PARISC: PARISC
New macros for CPU endianness have been added as well (see internal changes
below for details):
* NPY_BYTE_ORDER: integer
* NPY_LITTLE_ENDIAN/NPY_BIG_ENDIAN defines
Those provide portable alternatives to glibc endian.h macros for platforms
without it.
Portable NAN, INFINITY, etc...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
npy_math.h now makes available several portable macros to get NAN, INFINITY:
* NPY_NAN: equivalent to NAN, which is a GNU extension
* NPY_INFINITY: equivalent to C99 INFINITY
* NPY_PZERO, NPY_NZERO: positive and negative zero respectively
Corresponding single and extended precision macros are available as well. All
references to NAN, or home-grown computation of NAN on the fly have been
removed for consistency.
Internal changes
================
numpy.core math configuration revamp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This should make porting to new platforms easier, and more robust. In
particular, the configuration stage does not need to execute any code on the
target platform, which is a first step toward cross-compilation.
http://projects.scipy.org/numpy/browser/trunk/doc/neps/math_config_clean.txt
umath refactor
~~~~~~~~~~~~~~
A lot of code cleanup for umath/ufunc code (charris).
Improvements to build warnings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Numpy can now build with -W -Wall without warnings
http://projects.scipy.org/numpy/browser/trunk/doc/neps/warnfix.txt
Separate core math library
~~~~~~~~~~~~~~~~~~~~~~~~~~
The core math functions (sin, cos, etc... for basic C types) have been put into
a separate library; it acts as a compatibility layer, to support most C99 maths
functions (real only for now). The library includes platform-specific fixes for
various maths functions, so using those versions should be more robust
than using your platform's functions directly. The API for existing functions is
exactly the same as the C99 math functions API; the only difference is the npy
prefix (npy_cos vs cos).
The core library will be made available to any extension in 1.4.0.
CPU arch detection
~~~~~~~~~~~~~~~~~~
npy_cpu.h defines numpy specific CPU defines, such as NPY_CPU_X86, etc...
Those are portable across OS and toolchains, and set up when the header is
parsed, so that they can be safely used even in the case of cross-compilation
(the values are not set when numpy is built), or for multi-arch binaries (e.g.
fat binaries on Mac OS X).
npy_endian.h defines numpy specific endianness defines, modeled on the glibc
endian.h. NPY_BYTE_ORDER is equivalent to BYTE_ORDER, and one of
NPY_LITTLE_ENDIAN or NPY_BIG_ENDIAN is defined. As for CPU archs, those are set
when the header is parsed by the compiler, and as such can be used for
cross-compilation and multi-arch binaries.

@@ -1,238 +0,0 @@
=========================
NumPy 1.4.0 Release Notes
=========================
This minor release includes numerous bug fixes, as well as a few new features.
It is backward compatible with the 1.3.0 release.
Highlights
==========
* New datetime dtype support to deal with dates in arrays
* Faster import time
* Extended array wrapping mechanism for ufuncs
* New Neighborhood iterator (C-level only)
* C99-like complex functions in npymath
New features
============
Extended array wrapping mechanism for ufuncs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
An __array_prepare__ method has been added to ndarray to provide subclasses
greater flexibility to interact with ufuncs and ufunc-like functions. ndarray
already provided __array_wrap__, which allowed subclasses to set the array type
for the result and populate metadata on the way out of the ufunc (as seen in
the implementation of MaskedArray). For some applications it is necessary to
provide checks and populate metadata *on the way in*. __array_prepare__ is
therefore called just after the ufunc has initialized the output array but
before computing the results and populating it. This way, checks can be made
and errors raised before operations which may modify data in place.
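A minimal sketch of the mechanism (the subclass below is hypothetical, for
illustration only):
>>> import numpy as np
>>> class Logged(np.ndarray):
...     def __array_prepare__(self, out_arr, context=None):
...         # runs once the output array is allocated, before it is filled
...         print("about to run %s" % context[0].__name__)
...         return out_arr
>>> x = np.arange(4).view(Logged)
>>> y = x + 1   # prints "about to run add" before the addition runs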
Automatic detection of forward incompatibilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Previously, if an extension was built against version N of NumPy and used on
a system with NumPy M < N, import_array was successful, which could cause
crashes because version M does not have a function that exists in N. Starting
from NumPy 1.4.0, this will cause a failure in import_array, so the error will
be caught early on.
New iterators
~~~~~~~~~~~~~
A new neighborhood iterator has been added to the C API. It can be used to
iterate over the items in a neighborhood of an array, and can handle boundary
conditions automatically. Zero and one padding are available, as well as
arbitrary constant value, mirror and circular padding.
New polynomial support
~~~~~~~~~~~~~~~~~~~~~~
New modules chebyshev and polynomial have been added. The new polynomial module
is not compatible with the current polynomial support in numpy, but is much
like the new chebyshev module. The most noticeable difference to most will
be that coefficients are specified from low to high power, that the low
level functions do *not* work with the Chebyshev and Polynomial classes as
arguments, and that the Chebyshev and Polynomial classes include a domain.
Mapping between domains is a linear substitution and the two classes can be
converted one to the other, allowing, for instance, a Chebyshev series in
one domain to be expanded as a polynomial in another domain. The new classes
should generally be used instead of the low level functions, the latter are
provided for those who wish to build their own classes.
The new modules are not automatically imported into the numpy namespace,
they must be explicitly brought in with an "import numpy.polynomial"
statement.
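For example (a sketch):
>>> from numpy.polynomial import Polynomial
>>> p = Polynomial([1, 2, 3])   # coefficients low to high: 1 + 2*x + 3*x**2
>>> p(2)                        # -> 17.0
>>> (p + p).coef                # -> array([ 2.,  4.,  6.])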
New C API
~~~~~~~~~
The following C functions have been added to the C API:
#. PyArray_GetNDArrayCFeatureVersion: return the *API* version of the
loaded numpy.
#. PyArray_Correlate2 - like PyArray_Correlate, but implements the usual
definition of correlation. Inputs are not swapped, and conjugate is
taken for complex arrays.
#. PyArray_NeighborhoodIterNew - a new iterator to iterate over a
neighborhood of a point, with automatic boundaries handling. It is
documented in the iterators section of the C-API reference, and you can
find some examples in the multiarray_test.c.src file in numpy.core.
New ufuncs
~~~~~~~~~~
The following ufuncs have been added to the C API:
#. copysign - return the value of the first argument with the sign copied
from the second argument.
#. nextafter - return the next representable floating point value of the
first argument toward the second argument.
New defines
~~~~~~~~~~~
The alpha processor is now defined and available in numpy/npy_cpu.h. The
failed detection of the PARISC processor has been fixed. The defines are:
#. NPY_CPU_HPPA: PARISC
#. NPY_CPU_ALPHA: Alpha
Testing
~~~~~~~
#. deprecated decorator: this decorator may be used to avoid cluttering
testing output while testing DeprecationWarning is effectively raised by
the decorated test.
#. assert_array_almost_equal_nulp: new method to compare two arrays of
floating point values. With this function, two values are considered
close if there are not many representable floating point values in
between, thus being more robust than assert_array_almost_equal when the
values fluctuate a lot.
#. assert_array_max_ulp: raise an assertion if there are more than N
representable numbers between two floating point values.
#. assert_warns: raise an AssertionError if a callable does not generate a
warning of the appropriate class, without altering the warning state.
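For example, assert_warns can be used like this (an illustrative sketch):
>>> import warnings
>>> from numpy.testing import assert_warns
>>> def old_api():
...     warnings.warn("use new_api instead", DeprecationWarning)
>>> assert_warns(DeprecationWarning, old_api)  # AssertionError if no warning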
Reusing npymath
~~~~~~~~~~~~~~~
In 1.3.0, we started putting portable C math routines in npymath library, so
that people can use those to write portable extensions. Unfortunately, it was
not possible to easily link against this library: in 1.4.0, support has been
added to numpy.distutils so that 3rd party can reuse this library. See coremath
documentation for more information.
Improved set operations
~~~~~~~~~~~~~~~~~~~~~~~
In previous versions of NumPy some set functions (intersect1d,
setxor1d, setdiff1d and setmember1d) could return incorrect results if
the input arrays contained duplicate items. These now work correctly
for input arrays with duplicates. setmember1d has been renamed to
in1d, as with the change to accept arrays with duplicates it is
no longer a set operation, and is conceptually similar to an
elementwise version of the Python operator 'in'. All of these
functions now accept the boolean keyword assume_unique. This is False
by default, but can be set True if the input arrays are known not
to contain duplicates, which can increase the functions' execution
speed.
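For example (a sketch):
>>> import numpy as np
>>> np.in1d([1, 2, 3, 4, 2], [2, 4])         # elementwise 'in'; duplicates OK
>>> # -> array([False, True, False, True, True])
>>> np.intersect1d([1, 1, 2, 3], [2, 3, 3])  # -> array([2, 3])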
Improvements
============
#. numpy import is noticeably faster (20 to 30% depending on the
platform and computer)
#. The sort functions now sort nans to the end (see the sketch after this list).
* Real sort order is [R, nan]
* Complex sort order is [R + Rj, R + nanj, nan + Rj, nan + nanj]
Complex numbers with the same nan placements are sorted according to
the non-nan part if it exists.
#. The type comparison functions have been made consistent with the new
sort order of nans. Searchsorted now works with sorted arrays
containing nan values.
#. Complex division has been made more resistant to overflow.
#. Complex floor division has been made more resistant to overflow.
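A sketch of the new nan handling in sort and searchsorted:
>>> import numpy as np
>>> np.sort([3.0, np.nan, 1.0])               # -> array([ 1., 3., nan])
>>> np.searchsorted([1.0, 3.0, np.nan], 2.0)  # works with nan present -> 1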
Deprecations
============
The following functions are deprecated:
#. correlate: it takes a new keyword argument old_behavior. When True (the
default), it returns the same result as before. When False, compute the
conventional correlation, and take the conjugate for complex arrays. The
old behavior will be removed in NumPy 1.5, and using it raises a
DeprecationWarning in 1.4.
#. unique1d: use unique instead. unique1d raises a deprecation
warning in 1.4, and will be removed in 1.5.
#. intersect1d_nu: use intersect1d instead. intersect1d_nu raises
a deprecation warning in 1.4, and will be removed in 1.5.
#. setmember1d: use in1d instead. setmember1d raises a deprecation
warning in 1.4, and will be removed in 1.5.
The following raise errors:
#. When operating on 0-d arrays, ``numpy.max`` and other functions accept
only ``axis=0``, ``axis=-1`` and ``axis=None``. Using an out-of-bounds
axis is an indication of a bug, so Numpy raises an error for these cases
now.
#. Specifying ``axis > MAX_DIMS`` is no longer allowed; Numpy now raises an
error instead of behaving similarly to ``axis=None``.
Internal changes
================
Use C99 complex functions when available
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The numpy complex types are now guaranteed to be ABI compatible with the C99
complex type, if available on the platform. Moreover, the complex ufuncs now
use the platform C99 functions instead of our own.
split multiarray and umath source code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The source code of multiarray and umath has been split into separate logical
compilation units. This should make the source code more approachable for
newcomers.
Separate compilation
~~~~~~~~~~~~~~~~~~~~
By default, every file of multiarray (and umath) is merged into one for
compilation as was the case before, but if NPY_SEPARATE_COMPILATION env
variable is set to a non-negative value, experimental individual compilation of
each file is enabled. This makes the compile/debug cycle much faster when
working on core numpy.
Separate core math library
~~~~~~~~~~~~~~~~~~~~~~~~~~
New functions which have been added:
* npy_copysign
* npy_nextafter
* npy_cpack
* npy_creal
* npy_cimag
* npy_cabs
* npy_cexp
* npy_clog
* npy_cpow
* npy_csqrt
* npy_ccos
* npy_csin

@@ -1,106 +0,0 @@
=========================
NumPy 1.5.0 Release Notes
=========================
Plans
=====
This release has the following aims:
* Python 3 compatibility
* :pep:`3118` compatibility
Highlights
==========
New features
============
Warning on casting complex to real
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Numpy now emits a `numpy.ComplexWarning` when a complex number is cast
into a real number. For example:
>>> x = np.array([1,2,3])
>>> x[:2] = np.array([1+2j, 1-2j])
ComplexWarning: Casting complex values to real discards the imaginary part
The cast indeed discards the imaginary part, and this may not be the
intended behavior in all cases, hence the warning. This warning can be
turned off in the standard way:
>>> import warnings
>>> warnings.simplefilter("ignore", np.ComplexWarning)
Dot method for ndarrays
~~~~~~~~~~~~~~~~~~~~~~~
Ndarrays now have the dot product also as a method, which allows writing
chains of matrix products as
>>> a.dot(b).dot(c)
instead of the longer alternative
>>> np.dot(a, np.dot(b, c))
linalg.slogdet function
~~~~~~~~~~~~~~~~~~~~~~~
The slogdet function returns the sign and logarithm of the determinant
of a matrix. Because the determinant may involve the product of many
small/large values, the result is often more accurate than that obtained
by simple multiplication.
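For example (a sketch):
>>> import numpy as np
>>> a = np.eye(3) * 1e-200
>>> np.linalg.det(a)      # the product of tiny values underflows to 0.0
>>> np.linalg.slogdet(a)  # -> (1.0, -1381.55...), i.e. (sign, log(abs(det)))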
new header
~~~~~~~~~~
The new header file ndarraytypes.h contains the symbols from
ndarrayobject.h that do not depend on the PY_ARRAY_UNIQUE_SYMBOL and
NO_IMPORT/_ARRAY macros. Broadly, these symbols are types, typedefs,
and enumerations; the array function calls are left in
ndarrayobject.h. This allows users to include array-related types and
enumerations without needing to concern themselves with the macro
expansions and their side-effects.
Changes
=======
polynomial.polynomial
---------------------
* The polyint and polyder functions now check that the specified number of
integrations or derivations is a non-negative integer. The number 0 is
a valid value for both functions.
* A degree method has been added to the Polynomial class.
* A trimdeg method has been added to the Polynomial class. It operates like
truncate except that the argument is the desired degree of the result,
not the number of coefficients.
* Polynomial.fit now uses None as the default domain for the fit. The default
Polynomial domain can be specified by using [] as the domain value.
* Weights can be used in both polyfit and Polynomial.fit
* A linspace method has been added to the Polynomial class to ease plotting.
polynomial.chebyshev
--------------------
* The chebint and chebder functions now check that the specified number of
integrations or derivations is a non-negative integer. The number 0 is
a valid value for both functions.
* A degree method has been added to the Chebyshev class.
* A trimdeg method has been added to the Chebyshev class. It operates like
truncate except that the argument is the desired degree of the result,
not the number of coefficients.
* Chebyshev.fit now uses None as the default domain for the fit. The default
Chebyshev domain can be specified by using [] as the domain value.
* Weights can be used in both chebfit and Chebyshev.fit
* A linspace method has been added to the Chebyshev class to ease plotting.
histogram
---------
After a two-year transition period, the old behavior of the histogram function
has been phased out, and the "new" keyword has been removed.

@@ -1,129 +0,0 @@
.. vim:syntax=rst
Introduction
============
This document proposes some enhancements for numpy and scipy releases.
Successive numpy and scipy releases are too far apart in time - some people
on the numpy release team feel that the process cannot improve without a bit
more formality. The main proposal is to follow a time-based release, with
expected dates for code freeze, beta and rc. The goal is twofold: make
releases more predictable, and move the code forward.
Rationale
=========
Right now, the release process of numpy is relatively organic. When some
features are ready, we may decide to make a new release. Because there is no
fixed schedule, people don't really know when new features and bug fixes will
go into a release. More significantly, having an expected release schedule
helps to *coordinate* efforts: at the beginning of a cycle, everybody can jump
in and add new code, even break things if needed. But after some point, only
bug fixes are accepted: this makes beta and RC releases much easier; calming
things down toward the release date helps focus on bugs and regressions.
Proposal
========
Time schedule
-------------
The proposed schedule is to release numpy every 9 weeks - the exact period can
be tweaked if it ends up not working as expected. There will be several stages
for the cycle:
* Development: anything can happen (by anything, we mean as currently
done). The focus is on new features, refactoring, etc...
* Beta: no new features; no bug fixes which require heavy changes. Only
regression fixes which appear on supported platforms and were not
caught earlier.
* Polish/RC: only docstring changes and blocker regressions are allowed.
The schedule would be as follows:
+------+-----------------+-----------------+------------------+
| Week | 1.3.0 | 1.4.0 | Release time |
+======+=================+=================+==================+
| 1 | Development | | |
+------+-----------------+-----------------+------------------+
| 2 | Development | | |
+------+-----------------+-----------------+------------------+
| 3 | Development | | |
+------+-----------------+-----------------+------------------+
| 4 | Development | | |
+------+-----------------+-----------------+------------------+
| 5 | Development | | |
+------+-----------------+-----------------+------------------+
| 6 | Development | | |
+------+-----------------+-----------------+------------------+
| 7 | Beta | | |
+------+-----------------+-----------------+------------------+
| 8 | Beta | | |
+------+-----------------+-----------------+------------------+
| 9 | Beta | | 1.3.0 released |
+------+-----------------+-----------------+------------------+
| 10 | Polish | Development | |
+------+-----------------+-----------------+------------------+
| 11 | Polish | Development | |
+------+-----------------+-----------------+------------------+
| 12 | Polish | Development | |
+------+-----------------+-----------------+------------------+
| 13 | Polish | Development | |
+------+-----------------+-----------------+------------------+
| 14 | | Development | |
+------+-----------------+-----------------+------------------+
| 15 | | Development | |
+------+-----------------+-----------------+------------------+
| 16 | | Beta | |
+------+-----------------+-----------------+------------------+
| 17 | | Beta | |
+------+-----------------+-----------------+------------------+
| 18 | | Beta | 1.4.0 released |
+------+-----------------+-----------------+------------------+
Each stage can be defined as follows:
+------------------+-------------+----------------+----------------+
| | Development | Beta | Polish |
+==================+=============+================+================+
| Python Frozen | | slushy | Y |
+------------------+-------------+----------------+----------------+
| Docstring Frozen | | slushy | thicker slush |
+------------------+-------------+----------------+----------------+
| C code Frozen | | thicker slush | thicker slush |
+------------------+-------------+----------------+----------------+
Terminology:
* slushy: you can change it if you beg the release team and it's really
important and you coordinate with docs/translations; no "big"
changes.
* thicker slush: you can change it if it's an open bug marked
showstopper for the Polish release, you beg the release team, the
change is very very small yet very very important, and you feel
extremely guilty about your transgressions.
The different frozen states are intended to be gradients. The exact meaning is
decided by the release manager: he has the last word on what goes in and what
doesn't. The proposed schedule means that there would be at most 12 weeks
between putting code into the source code repository and being released.
Release team
------------
For every release, there would be at least one release manager. We propose to
rotate the release manager: rotation means it is not always the same person
doing the dirty job, and it should also keep the release manager honest.
References
==========
* Proposed schedule for Gnome from Havoc Pennington (one of the core
GTK and Gnome managers):
http://mail.gnome.org/archives/gnome-hackers/2002-June/msg00041.html
The proposed schedule is heavily based on this email
* http://live.gnome.org/ReleasePlanning/Freezes

@@ -1,183 +0,0 @@
@import "default.css";
/**
* Spacing fixes
*/
div.body p, div.body dd, div.body li {
line-height: 125%;
}
ul.simple {
margin-top: 0;
margin-bottom: 0;
padding-top: 0;
padding-bottom: 0;
}
/* spacing around blockquoted fields in parameters/attributes/returns */
td.field-body > blockquote {
margin-top: 0.1em;
margin-bottom: 0.5em;
}
/* spacing around example code */
div.highlight > pre {
padding: 2px 5px 2px 5px;
}
/* spacing in see also definition lists */
dl.last > dd {
margin-top: 1px;
margin-bottom: 5px;
margin-left: 30px;
}
/**
* Hide dummy toctrees
*/
ul {
padding-top: 0;
padding-bottom: 0;
margin-top: 0;
margin-bottom: 0;
}
ul li {
padding-top: 0;
padding-bottom: 0;
margin-top: 0;
margin-bottom: 0;
}
ul li a.reference {
padding-top: 0;
padding-bottom: 0;
margin-top: 0;
margin-bottom: 0;
}
/**
* Make high-level subsections easier to distinguish from top-level ones
*/
div.body h3 {
background-color: transparent;
}
div.body h4 {
border: none;
background-color: transparent;
}
/**
* Scipy colors
*/
body {
background-color: rgb(100,135,220);
}
div.document {
background-color: rgb(230,230,230);
}
div.sphinxsidebar {
background-color: rgb(230,230,230);
}
div.related {
background-color: rgb(100,135,220);
}
div.sphinxsidebar h3 {
color: rgb(0,102,204);
}
div.sphinxsidebar h3 a {
color: rgb(0,102,204);
}
div.sphinxsidebar h4 {
color: rgb(0,82,194);
}
div.sphinxsidebar p {
color: black;
}
div.sphinxsidebar a {
color: #355f7c;
}
div.sphinxsidebar ul.want-points {
list-style: disc;
}
.field-list th {
color: rgb(0,102,204);
}
/**
* Extra admonitions
*/
div.tip {
background-color: #ffffe4;
border: 1px solid #ee6;
}
div.plot-output {
clear-after: both;
}
div.plot-output .figure {
float: left;
text-align: center;
margin-bottom: 0;
padding-bottom: 0;
}
div.plot-output .caption {
margin-top: 2;
padding-top: 0;
}
div.plot-output p.admonition-title {
display: none;
}
div.plot-output:after {
content: "";
display: block;
height: 0;
clear: both;
}
/*
div.admonition-example {
background-color: #e4ffe4;
border: 1px solid #ccc;
}*/
/**
* Styling for field lists
*/
table.field-list th {
border-left: 1px solid #aaa !important;
padding-left: 5px;
}
table.field-list {
border-collapse: separate;
border-spacing: 10px;
}
/**
* Styling for footnotes
*/
table.footnote td, table.footnote th {
border: none;
}

@@ -1,23 +0,0 @@
{% extends "!autosummary/class.rst" %}
{% block methods %}
{% if methods %}
.. HACK
.. autosummary::
:toctree:
{% for item in methods %}
{{ name }}.{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}
{% block attributes %}
{% if attributes %}
.. HACK
.. autosummary::
:toctree:
{% for item in attributes %}
{{ name }}.{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}

@@ -1,56 +0,0 @@
{% extends "defindex.html" %}
{% block tables %}
<p><strong>Parts of the documentation:</strong></p>
<table class="contentstable" align="center"><tr>
<td width="50%">
<p class="biglink"><a class="biglink" href="{{ pathto("user/index") }}">Numpy User Guide</a><br/>
<span class="linkdescr">start here</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("reference/index") }}">Numpy Reference</a><br/>
<span class="linkdescr">reference documentation</span></p>
</td></tr>
</table>
<p><strong>Indices and tables:</strong></p>
<table class="contentstable" align="center"><tr>
<td width="50%">
<p class="biglink"><a class="biglink" href="{{ pathto("modindex") }}">Module Index</a><br/>
<span class="linkdescr">quick access to all modules</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("genindex") }}">General Index</a><br/>
<span class="linkdescr">all functions, classes, terms</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("glossary") }}">Glossary</a><br/>
<span class="linkdescr">the most important terms explained</span></p>
</td><td width="50%">
<p class="biglink"><a class="biglink" href="{{ pathto("search") }}">Search page</a><br/>
<span class="linkdescr">search this documentation</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("contents") }}">Complete Table of Contents</a><br/>
<span class="linkdescr">lists all sections and subsections</span></p>
</td></tr>
</table>
<p><strong>Meta information:</strong></p>
<table class="contentstable" align="center"><tr>
<td width="50%">
<p class="biglink"><a class="biglink" href="{{ pathto("bugs") }}">Reporting bugs</a></p>
<p class="biglink"><a class="biglink" href="{{ pathto("about") }}">About NumPy</a></p>
</td><td width="50%">
<p class="biglink"><a class="biglink" href="{{ pathto("release") }}">Release Notes</a></p>
<p class="biglink"><a class="biglink" href="{{ pathto("license") }}">License of Numpy</a></p>
</td></tr>
</table>
<h2>Acknowledgements</h2>
<p>
Large parts of this manual originate from Travis E. Oliphant's book
<a href="http://www.tramy.us/">"Guide to Numpy"</a> (which generously entered
Public Domain in August 2008). The reference documentation for many of
the functions are written by numerous contributors and developers of
Numpy, both prior to and during the
<a href="http://scipy.org/Developer_Zone/DocMarathon2008">Numpy Documentation Marathon</a>.
</p>
<p>
The Documentation Marathon is still ongoing. Please help us write
better documentation for Numpy by joining it! Instructions on how to
join and what to do can be found
<a href="http://scipy.org/Developer_Zone/DocMarathon2008">on the scipy.org website</a>.
</p>
{% endblock %}

@@ -1,5 +0,0 @@
<h3>Resources</h3>
<ul>
<li><a href="http://scipy.org/">Scipy.org website</a></li>
<li>&nbsp;</li>
</ul>

@@ -1,17 +0,0 @@
{% extends "!layout.html" %}
{% block rootrellink %}
<li><a href="{{ pathto('index') }}">{{ shorttitle }}</a>{{ reldelim1 }}</li>
{% endblock %}
{% block sidebarsearch %}
{%- if sourcename %}
<ul class="this-page-menu">
{%- if 'reference/generated' in sourcename %}
<li><a href="/numpy/docs/{{ sourcename.replace('reference/generated/', '').replace('.txt', '') |e }}">{{_('Edit page')}}</a></li>
{%- else %}
<li><a href="/numpy/docs/numpy-docs/{{ sourcename.replace('.txt', '.rst') |e }}">{{_('Edit page')}}</a></li>
{%- endif %}
</ul>
{%- endif %}
{{ super() }}
{% endblock %}

@@ -1,65 +0,0 @@
About NumPy
===========
`NumPy <http://www.scipy.org/NumpPy/>`__ is the fundamental package
needed for scientific computing with Python. This package contains:
- a powerful N-dimensional :ref:`array object <arrays>`
- sophisticated :ref:`(broadcasting) functions <ufuncs>`
- basic :ref:`linear algebra functions <routines.linalg>`
- basic :ref:`Fourier transforms <routines.fft>`
- sophisticated :ref:`random number capabilities <routines.random>`
- tools for integrating Fortran code
- tools for integrating C/C++ code
Besides its obvious scientific uses, *NumPy* can also be used as an
efficient multi-dimensional container of generic data. Arbitrary
data types can be defined. This allows *NumPy* to seamlessly and
speedily integrate with a wide variety of databases.
NumPy is a successor for two earlier scientific Python libraries:
NumPy derives from the old *Numeric* code base and can be used
as a replacement for *Numeric*. It also adds the features introduced
by *Numarray* and can also be used to replace *Numarray*.
NumPy community
---------------
Numpy is a distributed, volunteer, open-source project. *You* can help
us make it better; if you believe something should be improved either
in functionality or in documentation, don't hesitate to contact us --- or
even better, contact us and participate in fixing the problem.
Our main means of communication are:
- `scipy.org website <http://scipy.org/>`__
- `Mailing lists <http://scipy.org/Mailing_Lists>`__
- `Numpy Trac <http://projects.scipy.org/numpy>`__ (bug "tickets" go here)
More information about the development of Numpy can be found at
http://scipy.org/Developer_Zone
If you want to fix issues in this documentation, the easiest way
is to participate in `our ongoing documentation marathon
<http://scipy.org/Developer_Zone/DocMarathon2008>`__.
About this documentation
========================
Conventions
-----------
Names of classes, objects, constants, etc. are given in **boldface** font.
Often they are also links to a more detailed documentation of the
referred object.
This manual contains many examples of use, usually prefixed with the
Python prompt ``>>>`` (which is not a part of the example code). The
examples assume that you have first entered::
>>> import numpy as np
before running the examples.

View file

@ -1,23 +0,0 @@
**************
Reporting bugs
**************
File bug reports or feature requests, and make contributions
(e.g. code patches), by submitting a "ticket" on the Trac pages:
- Numpy Trac: http://scipy.org/scipy/numpy
Because of spam abuse, you must create an account on our Trac in order
to submit a ticket, then click on the "New Ticket" tab that only
appears when you have logged in. Please give as much information as
you can in the ticket. It is extremely useful if you can supply a
small self-contained code snippet that reproduces the problem. Also
specify the component, the version you are referring to and the
milestone.
Report bugs to the appropriate Trac instance (there is one for NumPy
and a different one for SciPy). There are also read-only mailing lists
for tracking the status of your bug ticket.
More information can be found on the http://scipy.org/Developer_Zone
website.

View file

@ -1,274 +0,0 @@
# -*- coding: utf-8 -*-
import sys, os, re
# Check Sphinx version
import sphinx
if sphinx.__version__ < "0.5":
raise RuntimeError("Sphinx 0.5.dev or newer required")
# -----------------------------------------------------------------------------
# General configuration
# -----------------------------------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
sys.path.insert(0, os.path.abspath('../sphinxext'))
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.pngmath', 'numpydoc',
'sphinx.ext.intersphinx', 'sphinx.ext.coverage',
'sphinx.ext.doctest',
'plot_directive']
if sphinx.__version__ >= "0.7":
extensions.append('sphinx.ext.autosummary')
else:
extensions.append('autosummary')
extensions.append('only_directives')
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The master toctree document.
#master_doc = 'index'
# General substitutions.
project = 'NumPy'
copyright = '2008-2009, The Scipy community'
# The default replacements for |version| and |release|, also used in various
# other places throughout the built documents.
#
import numpy
# The short X.Y version (including .devXXXX, rcX, b1 suffixes if present)
version = re.sub(r'(\d+\.\d+)\.\d+(.*)', r'\1\2', numpy.__version__)
version = re.sub(r'(\.dev\d+).*?$', r'\1', version)
# The full version, including alpha/beta/rc tags.
release = numpy.__version__
print version, release
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
today_fmt = '%B %d, %Y'
# List of documents that shouldn't be included in the build.
#unused_docs = []
# The reST default role (used for this markup: `text`) to use for all documents.
default_role = "autolink"
# List of directories, relative to source directories, that shouldn't be searched
# for source files.
exclude_dirs = []
# If true, '()' will be appended to :func: etc. cross-reference text.
add_function_parentheses = False
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# -----------------------------------------------------------------------------
# HTML output
# -----------------------------------------------------------------------------
# The style sheet to use for HTML and HTML Help pages. A file of that name
# must exist either in Sphinx' static/ path, or in one of the custom paths
# given in html_static_path.
html_style = 'scipy.css'
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
html_title = "%s v%s Manual (DRAFT)" % (project, version)
# The name of an image file (within the static path) to place at the top of
# the sidebar.
html_logo = 'scipyshiny_small.png'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
html_sidebars = {
'index': 'indexsidebar.html'
}
# Additional templates that should be rendered to pages, maps page names to
# template names.
html_additional_pages = {
'index': 'indexcontent.html',
}
# If false, no module index is generated.
html_use_modindex = True
# If true, the reST sources are included in the HTML build as _sources/<name>.
#html_copy_source = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# If nonempty, this is the file name suffix for HTML files (e.g. ".html").
#html_file_suffix = '.html'
# Output file base name for HTML help builder.
htmlhelp_basename = 'numpy'
# Pngmath should try to align formulas properly
pngmath_use_preview = True
# -----------------------------------------------------------------------------
# LaTeX output
# -----------------------------------------------------------------------------
# The paper size ('letter' or 'a4').
#latex_paper_size = 'letter'
# The font size ('10pt', '11pt' or '12pt').
#latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, document class [howto/manual]).
_stdauthor = 'Written by the NumPy community'
latex_documents = [
('reference/index', 'numpy-ref.tex', 'NumPy Reference',
_stdauthor, 'manual'),
('user/index', 'numpy-user.tex', 'NumPy User Guide',
_stdauthor, 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# Additional stuff for the LaTeX preamble.
latex_preamble = r'''
\usepackage{amsmath}
\DeclareUnicodeCharacter{00A0}{\nobreakspace}
% In the parameters section, place a newline after the Parameters
% header
\usepackage{expdlist}
\let\latexdescription=\description
\def\description{\latexdescription{}{} \breaklabel}
% Make Examples/etc section headers smaller and more compact
\makeatletter
\titleformat{\paragraph}{\normalsize\py@HeaderFamily}%
{\py@TitleColor}{0em}{\py@TitleColor}{\py@NormalColor}
\titlespacing*{\paragraph}{0pt}{1ex}{0pt}
\makeatother
% Fix footer/header
\renewcommand{\chaptermark}[1]{\markboth{\MakeUppercase{\thechapter.\ #1}}{}}
\renewcommand{\sectionmark}[1]{\markright{\MakeUppercase{\thesection.\ #1}}}
'''
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
latex_use_modindex = False
# -----------------------------------------------------------------------------
# Intersphinx configuration
# -----------------------------------------------------------------------------
intersphinx_mapping = {'http://docs.python.org/dev': None}
# -----------------------------------------------------------------------------
# Numpy extensions
# -----------------------------------------------------------------------------
# If we want to do a phantom import from an XML file for all autodocs
phantom_import_file = 'dump.xml'
# Make numpydoc to generate plots for example sections
numpydoc_use_plots = True
# -----------------------------------------------------------------------------
# Autosummary
# -----------------------------------------------------------------------------
if sphinx.__version__ >= "0.7":
import glob
autosummary_generate = glob.glob("reference/*.rst")
# -----------------------------------------------------------------------------
# Coverage checker
# -----------------------------------------------------------------------------
coverage_ignore_modules = r"""
""".split()
coverage_ignore_functions = r"""
test($|_) (some|all)true bitwise_not cumproduct pkgload
generic\.
""".split()
coverage_ignore_classes = r"""
""".split()
coverage_c_path = []
coverage_c_regexes = {}
coverage_ignore_c_items = {}
# -----------------------------------------------------------------------------
# Plots
# -----------------------------------------------------------------------------
plot_pre_code = """
import numpy as np
np.random.seed(0)
"""
plot_include_source = True
plot_formats = [('png', 100), 'pdf']
import math
phi = (math.sqrt(5) + 1)/2
import matplotlib
matplotlib.rcParams.update({
'font.size': 8,
'axes.titlesize': 8,
'axes.labelsize': 8,
'xtick.labelsize': 8,
'ytick.labelsize': 8,
'legend.fontsize': 8,
'figure.figsize': (3*phi, 3),
'figure.subplot.bottom': 0.2,
'figure.subplot.left': 0.2,
'figure.subplot.right': 0.9,
'figure.subplot.top': 0.85,
'figure.subplot.wspace': 0.4,
'text.usetex': False,
})

View file

@ -1,13 +0,0 @@
#####################
Numpy manual contents
#####################
.. toctree::
user/index
reference/index
release
about
bugs
license
glossary

View file

@ -1,14 +0,0 @@
********
Glossary
********
.. toctree::
.. glossary::
.. automodule:: numpy.doc.glossary
Jargon
------
.. automodule:: numpy.doc.jargon

View file

@ -1,35 +0,0 @@
*************
Numpy License
*************
Copyright (c) 2005, NumPy Developers
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
* Neither the name of the NumPy Developers nor the names of any
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View file

@ -1,423 +0,0 @@
#########################
Standard array subclasses
#########################
.. currentmodule:: numpy
The :class:`ndarray` in NumPy is a "new-style" Python
built-in type. It can therefore be inherited from (in Python or in C)
if desired, and can thus form a foundation for many useful
classes. Often whether to sub-class the array object or to simply use
the core array component as an internal part of a new class is a
difficult decision, and can be simply a matter of choice. NumPy has
several tools for simplifying how your new object interacts with other
array objects, and so the choice may not be significant in the
end. One way to simplify the question is to ask yourself whether the
object you are interested in can be replaced by a single array, or
whether it really requires two or more arrays at its core.
Note that :func:`asarray` always returns the base-class ndarray. If
you are confident that your use of the array object can handle any
subclass of an ndarray, then :func:`asanyarray` can be used to allow
subclasses to propagate more cleanly through your subroutine. In
principle, a subclass could redefine any aspect of the array and
therefore, under strict guidelines, :func:`asanyarray` would rarely be
useful. However, most subclasses of the array object will not
redefine certain aspects of the array object such as the buffer
interface, or the attributes of the array. One important example,
however, of why your subroutine may not be able to handle an arbitrary
subclass of an array is that matrices redefine the "*" operator to be
matrix-multiplication, rather than element-by-element multiplication.
Special attributes and methods
==============================
.. seealso:: :ref:`Subclassing ndarray <basics.subclassing>`
Numpy provides several hooks that subclasses of :class:`ndarray` can
customize:
.. function:: __array_finalize__(self, obj)
This method is called whenever the system internally allocates a
new array from *obj*, where *obj* is a subclass (subtype) of the
:class:`ndarray`. It can be used to change attributes of *self*
after construction (so as to ensure a 2-d matrix for example), or
to update meta-information from the "parent." Subclasses inherit
a default implementation of this method that does nothing.
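A minimal sketch (the ``InfoArray`` name and ``info`` attribute are hypothetical) of a subclass using this hook to propagate meta-information to views and new instances:
>>> class InfoArray(np.ndarray):
...     def __new__(cls, input_array, info=None):
...         obj = np.asarray(input_array).view(cls)
...         obj.info = info
...         return obj
...     def __array_finalize__(self, obj):
...         # obj is the "parent" array (or None); copy its meta-data
...         self.info = getattr(obj, 'info', None)
>>> a = InfoArray([1, 2, 3], info='metres')
>>> a[:2].info  # the view created by slicing inherits the attribute
'metres'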
.. function:: __array_prepare__(array, context=None)
At the beginning of every :ref:`ufunc <ufuncs.output-type>`, this
method is called on the input object with the highest array
priority, or the output object if one was specified. The output
array is passed in and whatever is returned is passed to the ufunc.
Subclasses inherit a default implementation of this method which
simply returns the output array unmodified. Subclasses may opt to
use this method to transform the output array into an instance of
the subclass and update metadata before returning the array to the
ufunc for computation.
.. function:: __array_wrap__(array, context=None)
At the end of every :ref:`ufunc <ufuncs.output-type>`, this method
is called on the input object with the highest array priority, or
the output object if one was specified. The ufunc-computed array
is passed in and whatever is returned is passed to the user.
Subclasses inherit a default implementation of this method, which
transforms the array into a new instance of the object's class.
Subclasses may opt to use this method to transform the output array
into an instance of the subclass and update metadata before
returning the array to the user.
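A small sketch (the ``Wrapped`` class is hypothetical) showing when the hook fires:
>>> class Wrapped(np.ndarray):
...     def __array_wrap__(self, out_arr, context=None):
...         print 'ufunc finished'   # called once per ufunc invocation
...         return np.ndarray.__array_wrap__(self, out_arr, context)
>>> w = np.arange(3).view(Wrapped)
>>> w + 1
ufunc finished
Wrapped([1, 2, 3])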
.. data:: __array_priority__
The value of this attribute is used to determine what type of
object to return in situations where there is more than one
possibility for the Python type of the returned object. Subclasses
inherit a default value of 1.0 for this attribute.
.. function:: __array__([dtype])
If a class having the :obj:`__array__` method is used as the output
object of an :ref:`ufunc <ufuncs.output-type>`, results will be
written to the object returned by :obj:`__array__`.
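A minimal sketch of the behavior described above (the ``Holder`` class is hypothetical, and this assumes the ufunc accepts any object exposing :obj:`__array__` as its output):
>>> class Holder(object):
...     def __init__(self, n):
...         self.data = np.zeros(n)
...     def __array__(self, dtype=None):
...         return self.data            # results are written here
>>> h = Holder(3)
>>> result = np.add([1., 2., 3.], 1., h)
>>> h.data
array([ 2.,  3.,  4.])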
Matrix objects
==============
.. index::
single: matrix
:class:`matrix` objects inherit from the ndarray and therefore, they
have the same attributes and methods of ndarrays. There are six
important differences of matrix objects, however, that may lead to
unexpected results when you use matrices but expect them to act like
arrays:
1. Matrix objects can be created using a string notation to allow
Matlab-style syntax where spaces separate columns and semicolons
(';') separate rows.
2. Matrix objects are always two-dimensional. This has far-reaching
implications, in that m.ravel() is still two-dimensional (with a 1
in the first dimension) and item selection returns two-dimensional
objects so that sequence behavior is fundamentally different than
arrays.
3. Matrix objects over-ride multiplication to be
matrix-multiplication. **Make sure you understand this for
functions that you may want to receive matrices. Especially in
light of the fact that asanyarray(m) returns a matrix when m is
a matrix.**
4. Matrix objects over-ride power to be matrix raised to a power. The
same warning about using power inside a function that uses
asanyarray(...) to get an array object holds for this fact.
5. The default __array_priority\__ of matrix objects is 10.0, and
therefore mixed operations with ndarrays always produce matrices.
6. Matrices have special attributes which make calculations easier.
These are
.. autosummary::
:toctree: generated/
matrix.T
matrix.H
matrix.I
matrix.A
.. warning::
Matrix objects over-ride multiplication, '*', and power, '**', to
be matrix-multiplication and matrix power, respectively. If your
subroutine can accept sub-classes and you do not convert to base-
class arrays, then you must use the ufuncs multiply and power to
be sure that you are performing the correct operation for all
inputs.
The matrix class is a Python subclass of the ndarray and can be used
as a reference for how to construct your own subclass of the ndarray.
Matrices can be created from other matrices, strings, and anything
else that can be converted to an ``ndarray``. The name "mat" is an
alias for "matrix" in NumPy.
.. autosummary::
:toctree: generated/
matrix
asmatrix
bmat
Example 1: Matrix creation from a string
>>> a=mat('1 2 3; 4 5 3')
>>> print (a*a.T).I
[[ 0.2924 -0.1345]
[-0.1345 0.0819]]
Example 2: Matrix creation from nested sequence
>>> mat([[1,5,10],[1.0,3,4j]])
matrix([[ 1.+0.j, 5.+0.j, 10.+0.j],
[ 1.+0.j, 3.+0.j, 0.+4.j]])
Example 3: Matrix creation from an array
>>> mat(random.rand(3,3)).T
matrix([[ 0.7699, 0.7922, 0.3294],
[ 0.2792, 0.0101, 0.9219],
[ 0.3398, 0.7571, 0.8197]])
Memory-mapped file arrays
=========================
.. index::
single: memory maps
.. currentmodule:: numpy
Memory-mapped files are useful for reading and/or modifying small
segments of a large file with regular layout, without reading the
entire file into memory. A simple subclass of the ndarray uses a
memory-mapped file for the data buffer of the array. For small files,
the overhead of reading the entire file into memory is typically not
significant; however, for large files, using memory mapping can save
considerable resources.
Memory-mapped-file arrays have one additional method (besides those
they inherit from the ndarray): :meth:`.flush() <memmap.flush>` which
must be called manually by the user to ensure that any changes to the
array actually get written to disk.
.. note::
Memory-mapped arrays use the Python memory-map object which
(prior to Python 2.5) does not allow files to be larger than a
certain size depending on the platform. This size is always
< 2GB even on 64-bit systems.
.. autosummary::
:toctree: generated/
memmap
memmap.flush
Example:
>>> a = memmap('newfile.dat', dtype=float, mode='w+', shape=1000)
>>> a[10] = 10.0
>>> a[30] = 30.0
>>> del a
>>> b = fromfile('newfile.dat', dtype=float)
>>> print b[10], b[30]
10.0 30.0
>>> a = memmap('newfile.dat', dtype=float)
>>> print a[10], a[30]
10.0 30.0
Character arrays (:mod:`numpy.char`)
====================================
.. seealso:: :ref:`routines.array-creation.char`
.. index::
single: character arrays
.. note::
The `chararray` class exists for backwards compatibility with
Numarray; it is not recommended for new development. Starting from numpy
1.4, if one needs arrays of strings, it is recommended to use arrays of
`dtype` `object_`, `string_` or `unicode_`, and use the free functions
in the `numpy.char` module for fast vectorized string operations.
These are enhanced arrays of either :class:`string_` type or
:class:`unicode_` type. These arrays inherit from the
:class:`ndarray`, but specially-define the operations ``+``, ``*``,
and ``%`` on a (broadcasting) element-by-element basis. These
operations are not available on the standard :class:`ndarray` of
character type. In addition, the :class:`chararray` has all of the
standard :class:`string <str>` (and :class:`unicode`) methods,
executing them on an element-by-element basis. Perhaps the easiest
way to create a chararray is to use :meth:`self.view(chararray)
<ndarray.view>` where *self* is an ndarray of str or unicode
data-type. However, a chararray can also be created using the
:meth:`numpy.chararray` constructor, or via the
:func:`numpy.char.array <core.defchararray.array>` function:
.. autosummary::
:toctree: generated/
chararray
core.defchararray.array
Another difference with the standard ndarray of str data-type is
that the chararray inherits the feature introduced by Numarray that
white-space at the end of any element in the array will be ignored
on item retrieval and comparison operations.
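For instance (a small illustration of the whitespace behavior):
>>> s = np.array(['abc ', 'de'], dtype='S4')
>>> c = s.view(np.chararray)
>>> c[0]            # trailing whitespace is stripped on retrieval
'abc'
>>> c == 'abc'      # ...and ignored in comparisons
array([ True, False], dtype=bool)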
.. _arrays.classes.rec:
Record arrays (:mod:`numpy.rec`)
================================
.. seealso:: :ref:`routines.array-creation.rec`, :ref:`routines.dtype`,
:ref:`arrays.dtypes`.
Numpy provides the :class:`recarray` class which allows accessing the
fields of a record/structured array as attributes, and a corresponding
scalar data type object :class:`record`.
.. currentmodule:: numpy
.. autosummary::
:toctree: generated/
recarray
record
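A short illustration (the field names are chosen arbitrarily):
>>> r = np.rec.array([(1, 2.0), (3, 4.0)], dtype=[('x', int), ('y', float)])
>>> r.x             # fields are accessible as attributes
array([1, 3])
>>> r[0].y          # item access yields a record scalar
2.0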
Masked arrays (:mod:`numpy.ma`)
===============================
.. seealso:: :ref:`maskedarray`
Standard container class
========================
.. currentmodule:: numpy
For backward compatibility and as a standard "container" class, the
UserArray from Numeric has been brought over to NumPy and named
:class:`numpy.lib.user_array.container`. The container class is a
Python class whose self.array attribute is an ndarray. Multiple
inheritance is probably easier with numpy.lib.user_array.container
than with the ndarray itself and so it is included by default. It is
not documented here beyond mentioning its existence because you are
encouraged to use the ndarray class directly if you can.
.. autosummary::
:toctree: generated/
numpy.lib.user_array.container
.. index::
single: user_array
single: container class
Array Iterators
===============
.. currentmodule:: numpy
.. index::
single: array iterator
Iterators are a powerful concept for array processing. Essentially,
iterators implement a generalized for-loop. If *myiter* is an iterator
object, then the Python code::
for val in myiter:
...
some code involving val
...
calls ``val = myiter.next()`` repeatedly until :exc:`StopIteration` is
raised by the iterator. There are several ways to iterate over an
array that may be useful: default iteration, flat iteration, and
:math:`N`-dimensional enumeration.
Default iteration
-----------------
The default iterator of an ndarray object is the default Python
iterator of a sequence type. Thus, when the array object itself is
used as an iterator, the default behavior is equivalent to::
for i in range(arr.shape[0]):
val = arr[i]
This default iterator selects a sub-array of dimension :math:`N-1`
from the array. This can be a useful construct for defining recursive
algorithms. To loop over the entire array requires :math:`N` for-loops.
>>> a = arange(24).reshape(3,2,4)+10
>>> for val in a:
... print 'item:', val
item: [[10 11 12 13]
[14 15 16 17]]
item: [[18 19 20 21]
[22 23 24 25]]
item: [[26 27 28 29]
[30 31 32 33]]
Flat iteration
--------------
.. autosummary::
:toctree: generated/
ndarray.flat
As mentioned previously, the flat attribute of ndarray objects returns
an iterator that will cycle over the entire array in C-style
contiguous order.
>>> for i, val in enumerate(a.flat):
... if i%5 == 0: print i, val
0 10
5 15
10 20
15 25
20 30
Here, I've used the built-in enumerate iterator to return the iterator
index as well as the value.
N-dimensional enumeration
-------------------------
.. autosummary::
:toctree: generated/
ndenumerate
Sometimes it may be useful to get the N-dimensional index while
iterating. The ndenumerate iterator can achieve this.
>>> for i, val in ndenumerate(a):
... if sum(i)%5 == 0: print i, val
(0, 0, 0) 10
(1, 1, 3) 25
(2, 0, 3) 29
(2, 1, 2) 32
Iterator for broadcasting
-------------------------
.. autosummary::
:toctree: generated/
broadcast
The general concept of broadcasting is also available from Python
using the :class:`broadcast` iterator. This object takes :math:`N`
objects as inputs and returns an iterator that returns tuples
providing each of the input sequence elements in the broadcasted
result.
>>> for val in broadcast([[1,0],[2,3]],[0,1]):
... print val
(1, 0)
(0, 1)
(2, 0)
(3, 1)

View file

@ -1,512 +0,0 @@
.. currentmodule:: numpy
.. _arrays.dtypes:
**********************************
Data type objects (:class:`dtype`)
**********************************
A data type object (an instance of :class:`numpy.dtype` class)
describes how the bytes in the fixed-size block of memory
corresponding to an array item should be interpreted. It describes the
following aspects of the data:
1. Type of the data (integer, float, Python object, etc.)
2. Size of the data (how many bytes are in *e.g.* the integer)
3. Byte order of the data (:term:`little-endian` or :term:`big-endian`)
4. If the data type is a :term:`record`, an aggregate of other
data types, (*e.g.*, describing an array item consisting of
an integer and a float),
1. what are the names of the ":term:`fields <field>`" of the record,
by which they can be :ref:`accessed <arrays.indexing.rec>`,
2. what is the data-type of each :term:`field`, and
3. which part of the memory block each field takes.
5. If the data is a sub-array, what is its shape and data type.
.. index::
pair: dtype; scalar
To describe the type of scalar data, there are several :ref:`built-in
scalar types <arrays.scalars.built-in>` in Numpy for various precision
of integers, floating-point numbers, *etc*. An item extracted from an
array, *e.g.*, by indexing, will be a Python object whose type is the
scalar type associated with the data type of the array.
Note that the scalar types are not :class:`dtype` objects, even though
they can be used in place of one whenever a data type specification is
needed in Numpy.
.. index::
pair: dtype; field
pair: dtype; record
Record data types are formed by creating a data type whose
:term:`fields` contain other data types. Each field has a name by
which it can be :ref:`accessed <arrays.indexing.rec>`. The parent data
type should be of sufficient size to contain all its fields; the
parent can for example be based on the :class:`void` type which allows
an arbitrary item size. Record data types may also contain other record
types and fixed-size sub-array data types in their fields.
.. index::
pair: dtype; sub-array
Finally, a data type can describe items that are themselves arrays of
items of another data type. These sub-arrays must, however, be of a
fixed size. If an array is created using a data-type describing a
sub-array, the dimensions of the sub-array are appended to the shape
of the array when the array is created. Sub-arrays in a field of a
record behave differently, see :ref:`arrays.indexing.rec`.
.. admonition:: Example
A simple data type containing a 32-bit big-endian integer:
(see :ref:`arrays.dtypes.constructing` for details on construction)
>>> dt = np.dtype('>i4')
>>> dt.byteorder
'>'
>>> dt.itemsize
4
>>> dt.name
'int32'
>>> dt.type is np.int32
True
The corresponding array scalar type is :class:`int32`.
.. admonition:: Example
A record data type containing a 16-character string (in field 'name')
and a sub-array of two 64-bit floating-point number (in field 'grades'):
>>> dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
>>> dt['name']
dtype('|S16')
>>> dt['grades']
dtype(('float64',(2,)))
Items of an array of this data type are wrapped in an :ref:`array
scalar <arrays.scalars>` type that also has two fields:
>>> x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)
>>> x[1]
('John', [6.0, 7.0])
>>> x[1]['grades']
array([ 6., 7.])
>>> type(x[1])
<type 'numpy.void'>
>>> type(x[1]['grades'])
<type 'numpy.ndarray'>
.. _arrays.dtypes.constructing:
Specifying and constructing data types
======================================
Whenever a data-type is required in a NumPy function or method, either
a :class:`dtype` object or something that can be converted to one can
be supplied. Such conversions are done by the :class:`dtype`
constructor:
.. autosummary::
:toctree: generated/
dtype
What can be converted to a data-type object is described below:
:class:`dtype` object
.. index::
triple: dtype; construction; from dtype
Used as-is.
:const:`None`
.. index::
triple: dtype; construction; from None
The default data type: :class:`float_`.
.. index::
triple: dtype; construction; from type
Array-scalar types
The 21 built-in :ref:`array scalar type objects
<arrays.scalars.built-in>` all convert to an associated data-type object.
This is true for their sub-classes as well.
Note that not all data-type information can be supplied with a
type-object: for example, :term:`flexible` data-types have
a default *itemsize* of 0, and require an explicitly given size
to be useful.
.. admonition:: Example
>>> dt = np.dtype(np.int32) # 32-bit integer
>>> dt = np.dtype(np.complex128) # 128-bit complex floating-point number
Generic types
The generic hierarchical type objects convert to corresponding
type objects according to the associations:
===================================================== ===============
:class:`number`, :class:`inexact`, :class:`floating` :class:`float`
:class:`complexfloating` :class:`cfloat`
:class:`integer`, :class:`signedinteger` :class:`int\_`
:class:`unsignedinteger` :class:`uint`
:class:`character` :class:`string`
:class:`generic`, :class:`flexible` :class:`void`
===================================================== ===============
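.. admonition:: Example
A sketch; per the table above, the abstract :class:`floating` type converts to the default concrete float type:
>>> dt = np.dtype(np.floating)  # same as np.dtype(float) here
>>> dt
dtype('float64')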
Built-in Python types
Several python types are equivalent to a corresponding
array scalar when used to generate a :class:`dtype` object:
================ ===============
:class:`int` :class:`int\_`
:class:`bool` :class:`bool\_`
:class:`float` :class:`float\_`
:class:`complex` :class:`cfloat`
:class:`str` :class:`string`
:class:`unicode` :class:`unicode\_`
:class:`buffer` :class:`void`
(all others) :class:`object_`
================ ===============
.. admonition:: Example
>>> dt = np.dtype(float) # Python-compatible floating-point number
>>> dt = np.dtype(int) # Python-compatible integer
>>> dt = np.dtype(object) # Python object
Types with ``.dtype``
Any type object with a ``dtype`` attribute: The attribute will be
accessed and used directly. The attribute must return something
that is convertible into a dtype object.
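A minimal sketch (the ``HasDtype`` class is hypothetical, assuming the constructor resolves the ``dtype`` attribute as described):
>>> class HasDtype(object):
...     dtype = np.dtype('f8')
>>> np.dtype(HasDtype)
dtype('float64')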
.. index::
triple: dtype; construction; from string
Several kinds of strings can be converted. Recognized strings can be
prepended with ``'>'`` (:term:`big-endian`), ``'<'``
(:term:`little-endian`), or ``'='`` (hardware-native, the default), to
specify the byte order.
One-character strings
Each built-in data-type has a character code
(the updated Numeric typecodes), that uniquely identifies it.
.. admonition:: Example
>>> dt = np.dtype('b') # byte, native byte order
>>> dt = np.dtype('>H') # big-endian unsigned short
>>> dt = np.dtype('<f') # little-endian single-precision float
>>> dt = np.dtype('d') # double-precision floating-point number
Array-protocol type strings (see :ref:`arrays.interface`)
The first character specifies the kind of data and the remaining
characters specify how many bytes of data. The supported kinds are
================ ========================
``'b'`` Boolean
``'i'`` (signed) integer
``'u'`` unsigned integer
``'f'`` floating-point
``'c'`` complex-floating point
``'S'``, ``'a'`` string
``'U'`` unicode
``'V'`` anything (:class:`void`)
================ ========================
.. admonition:: Example
>>> dt = np.dtype('i4') # 32-bit signed integer
>>> dt = np.dtype('f8') # 64-bit floating-point number
>>> dt = np.dtype('c16') # 128-bit complex floating-point number
>>> dt = np.dtype('a25') # 25-character string
String with comma-separated fields
Numarray introduced a short-hand notation for specifying the format
of a record as a comma-separated string of basic formats.
A basic format in this context is an optional shape specifier
followed by an array-protocol type string. Parentheses are required
on the shape if it is greater than 1-d. NumPy allows a modification
on the format in that any string that can uniquely identify the
type can be used to specify the data-type in a field.
The generated data-type fields are named ``'f0'``, ``'f1'``, ...,
``'f<N-1>'`` where N (>1) is the number of comma-separated basic
formats in the string. If the optional shape specifier is provided,
then the data-type for the corresponding field describes a sub-array.
.. admonition:: Example
- field named ``f0`` containing a 32-bit integer
- field named ``f1`` containing a 2 x 3 sub-array
of 64-bit floating-point numbers
- field named ``f2`` containing a 32-bit floating-point number
>>> dt = np.dtype("i4, (2,3)f8, f4")
- field named ``f0`` containing a 3-character string
- field named ``f1`` containing a sub-array of shape (3,)
containing 64-bit unsigned integers
- field named ``f2`` containing a 3 x 4 sub-array
containing 10-character strings
>>> dt = np.dtype("a3, 3u8, (3,4)a10")
Type strings
Any string in :obj:`numpy.sctypeDict`.keys():
.. admonition:: Example
>>> dt = np.dtype('uint32') # 32-bit unsigned integer
>>> dt = np.dtype('Float64') # 64-bit floating-point number
.. index::
triple: dtype; construction; from tuple
``(flexible_dtype, itemsize)``
The first argument must be an object that is converted to a
flexible data-type object (one whose element size is 0), the
second argument is an integer providing the desired itemsize.
.. admonition:: Example
>>> dt = np.dtype((void, 10)) # 10-byte wide data block
>>> dt = np.dtype((str, 35)) # 35-character string
>>> dt = np.dtype(('U', 10)) # 10-character unicode string
``(fixed_dtype, shape)``
.. index::
pair: dtype; sub-array
The first argument is any object that can be converted into a
fixed-size data-type object. The second argument is the desired
shape of this type. If the shape parameter is 1, then the
data-type object is equivalent to the fixed dtype. If *shape* is a
tuple, then the new dtype defines a sub-array of the given shape.
.. admonition:: Example
>>> dt = np.dtype((np.int32, (2,2))) # 2 x 2 integer sub-array
>>> dt = np.dtype(('S10', 1)) # 10-character string
>>> dt = np.dtype(('i4, (2,3)f8, f4', (2,3))) # 2 x 3 record sub-array
``(base_dtype, new_dtype)``
Both arguments must be convertible to data-type objects in this
case. The *base_dtype* is the data-type object that the new
data-type builds on. This is how you could assign named fields to
any built-in data-type object.
.. admonition:: Example
32-bit integer, whose first two bytes are interpreted as an integer
via field ``real``, and the following two bytes via field ``imag``.
>>> dt = np.dtype((np.int32, {'real': (np.int16, 0), 'imag': (np.int16, 2)}))
32-bit integer, which is interpreted as consisting of a sub-array
of shape ``(4,)`` containing 8-bit integers:
>>> dt = np.dtype((np.int32, (np.int8, 4)))
32-bit integer, containing fields ``r``, ``g``, ``b``, ``a`` that
interpret the 4 bytes in the integer as four unsigned integers:
>>> dt = np.dtype(('i4', [('r','u1'),('g','u1'),('b','u1'),('a','u1')]))
.. index::
triple: dtype; construction; from list
``[(field_name, field_dtype, field_shape), ...]``
*obj* should be a list of fields where each field is described by a
tuple of length 2 or 3. (Equivalent to the ``descr`` item in the
:obj:`__array_interface__` attribute.)
The first element, *field_name*, is the field name (if this is
``''`` then a standard field name, ``'f#'``, is assigned). The
field name may also be a 2-tuple of strings where the first string
is either a "title" (which may be any string or unicode string) or
meta-data for the field which can be any object, and the second
string is the "name" which must be a valid Python identifier.
The second element, *field_dtype*, can be anything that can be
interpreted as a data-type.
The optional third element *field_shape* contains the shape if this
field represents an array of the data-type in the second
element. Note that a 3-tuple with a third argument equal to 1 is
equivalent to a 2-tuple.
This style does not accept *align* in the :class:`dtype`
constructor as it is assumed that all of the memory is accounted
for by the array interface description.
.. admonition:: Example
Data-type with fields ``big`` (big-endian 32-bit integer) and
``little`` (little-endian 32-bit integer):
>>> dt = np.dtype([('big', '>i4'), ('little', '<i4')])
Data-type with fields ``R``, ``G``, ``B``, ``A``, each being an
unsigned 8-bit integer:
>>> dt = np.dtype([('R','u1'), ('G','u1'), ('B','u1'), ('A','u1')])
.. index::
triple: dtype; construction; from dict
``{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ...}``
This style has two required and two optional keys. The *names*
and *formats* keys are required. Their respective values are
equal-length lists with the field names and the field formats.
The field names must be strings and the field formats can be any
object accepted by :class:`dtype` constructor.
The optional keys in the dictionary are *offsets* and *titles* and
their values must each be lists of the same length as the *names*
and *formats* lists. The *offsets* value is a list of byte offsets
(integers) for each field, while the *titles* value is a list of
titles for each field (:const:`None` can be used if no title is
desired for that field). The *titles* can be any :class:`string`
or :class:`unicode` object and will add another entry to the
fields dictionary keyed by the title and referencing the same
field tuple which will contain the title as an additional tuple
member.
.. admonition:: Example
Data type with fields ``r``, ``g``, ``b``, ``a``, each being
an 8-bit unsigned integer:
>>> dt = np.dtype({'names': ['r','g','b','a'],
... 'formats': [uint8, uint8, uint8, uint8]})
Data type with fields ``r`` and ``b`` (with the given titles),
both being 8-bit unsigned integers, the first at byte position
0 from the start of the field and the second at position 2:
>>> dt = np.dtype({'names': ['r','b'], 'formats': ['u1', 'u1'],
... 'offsets': [0, 2],
... 'titles': ['Red pixel', 'Blue pixel']})
``{'field1': ..., 'field2': ..., ...}``
This style allows passing in the :attr:`fields <dtype.fields>`
attribute of a data-type object.
*obj* should contain string or unicode keys that refer to
``(data-type, offset)`` or ``(data-type, offset, title)`` tuples.
.. admonition:: Example
Data type containing field ``col1`` (10-character string at
byte position 0), ``col2`` (32-bit float at byte position 10),
and ``col3`` (integers at byte position 14):
>>> dt = np.dtype({'col1': ('S10', 0), 'col2': (float32, 10),
'col3': (int, 14)})
:class:`dtype`
==============
Numpy data type descriptions are instances of the :class:`dtype` class.
Attributes
----------
The type of the data is described by the following :class:`dtype` attributes:
.. autosummary::
:toctree: generated/
dtype.type
dtype.kind
dtype.char
dtype.num
dtype.str
Size of the data is in turn described by:
.. autosummary::
:toctree: generated/
dtype.name
dtype.itemsize
Endianness of this data:
.. autosummary::
:toctree: generated/
dtype.byteorder
Information about sub-data-types in a :term:`record`:
.. autosummary::
:toctree: generated/
dtype.fields
dtype.names
For data types that describe sub-arrays:
.. autosummary::
:toctree: generated/
dtype.subdtype
dtype.shape
Attributes providing additional information:
.. autosummary::
:toctree: generated/
dtype.hasobject
dtype.flags
dtype.isbuiltin
dtype.isnative
dtype.descr
dtype.alignment
Methods
-------
Data types have the following method for changing the byte order:
.. autosummary::
:toctree: generated/
dtype.newbyteorder
The following methods implement the pickle protocol:
.. autosummary::
:toctree: generated/
dtype.__reduce__
dtype.__setstate__

View file

@ -1,368 +0,0 @@
.. _arrays.indexing:
Indexing
========
.. sectionauthor:: adapted from "Guide to Numpy" by Travis E. Oliphant
.. currentmodule:: numpy
.. index:: indexing, slicing
:class:`ndarrays <ndarray>` can be indexed using the standard Python
``x[obj]`` syntax, where *x* is the array and *obj* the selection.
There are three kinds of indexing available: record access, basic
slicing, advanced indexing. Which one occurs depends on *obj*.
.. note::
In Python, ``x[(exp1, exp2, ..., expN)]`` is equivalent to
``x[exp1, exp2, ..., expN]``; the latter is just syntactic sugar
for the former.
Basic Slicing
-------------
Basic slicing extends Python's basic concept of slicing to N
dimensions. Basic slicing occurs when *obj* is a :class:`slice` object
(constructed by ``start:stop:step`` notation inside of brackets), an
integer, or a tuple of slice objects and integers. :const:`Ellipsis`
and :const:`newaxis` objects can be interspersed with these as
well. In order to remain backward compatible with a common usage in
Numeric, basic slicing is also initiated if the selection object is
any sequence (such as a :class:`list`) containing :class:`slice`
objects, the :const:`Ellipsis` object, or the :const:`newaxis` object,
but no integer arrays or other embedded sequences.
.. index::
triple: ndarray; special methods; getslice
triple: ndarray; special methods; setslice
single: ellipsis
single: newaxis
The simplest case of indexing with *N* integers returns an :ref:`array
scalar <arrays.scalars>` representing the corresponding item. As in
Python, all indices are zero-based: for the *i*-th index :math:`n_i`,
the valid range is :math:`0 \le n_i < d_i` where :math:`d_i` is the
*i*-th element of the shape of the array. Negative indices are
interpreted as counting from the end of the array (*i.e.*, if *i < 0*,
it means :math:`n_i + i`).
All arrays generated by basic slicing are always :term:`views <view>`
of the original array.
The standard rules of sequence slicing apply to basic slicing on a
per-dimension basis (including using a step index). Some useful
concepts to remember include:
- The basic slice syntax is ``i:j:k`` where *i* is the starting index,
*j* is the stopping index, and *k* is the step (:math:`k\neq0`).
This selects the *m* elements (in the corresponding dimension) with
index values *i*, *i + k*, ..., *i + (m - 1) k* where
:math:`m = q + (r\neq0)` and *q* and *r* are the quotient and remainder
obtained by dividing *j - i* by *k*: *j - i = q k + r*, so that
*i + (m - 1) k < j*.
.. admonition:: Example
>>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[1:7:2]
array([1, 3, 5])
- Negative *i* and *j* are interpreted as *n + i* and *n + j* where
*n* is the number of elements in the corresponding dimension.
Negative *k* makes stepping go towards smaller indices.
.. admonition:: Example
>>> x[-2:10]
array([8, 9])
>>> x[-3:3:-1]
array([7, 6, 5, 4])
- Assume *n* is the number of elements in the dimension being
sliced. Then, if *i* is not given it defaults to 0 for *k > 0* and
*n* for *k < 0* . If *j* is not given it defaults to *n* for *k > 0*
and -1 for *k < 0* . If *k* is not given it defaults to 1. Note that
``::`` is the same as ``:`` and means select all indices along this
axis.
.. admonition:: Example
>>> x[5:]
array([5, 6, 7, 8, 9])
- If the number of objects in the selection tuple is less than
*N* , then ``:`` is assumed for any subsequent dimensions.
.. admonition:: Example
>>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x.shape
(2, 3, 1)
>>> x[1:2]
array([[[4],
[5],
[6]]])
- :const:`Ellipsis` expands to the number of ``:`` objects needed to
make a selection tuple of the same length as ``x.ndim``. Only the
first ellipsis is expanded, any others are interpreted as ``:``.
.. admonition:: Example
>>> x[...,0]
array([[1, 2, 3],
[4, 5, 6]])
- Each :const:`newaxis` object in the selection tuple serves to expand
the dimensions of the resulting selection by one unit-length
dimension. The added dimension is the position of the :const:`newaxis`
object in the selection tuple.
.. admonition:: Example
>>> x[:,np.newaxis,:,:].shape
(2, 1, 3, 1)
- An integer, *i*, returns the same values as ``i:i+1``
**except** the dimensionality of the returned object is reduced by
1. In particular, a selection tuple with the *p*-th
element an integer (and all other entries ``:``) returns the
corresponding sub-array with dimension *N - 1*. If *N = 1*
then the returned object is an array scalar. These objects are
explained in :ref:`arrays.scalars`.
- If the selection tuple has all entries ``:`` except the
*p*-th entry which is a slice object ``i:j:k``,
then the returned array has dimension *N* formed by
concatenating the sub-arrays returned by integer indexing of
elements *i*, *i+k*, ..., *i + (m - 1) k < j*.
- Basic slicing with more than one non-``:`` entry in the slicing
tuple, acts like repeated application of slicing using a single
non-``:`` entry, where the non-``:`` entries are successively taken
(with all other non-``:`` entries replaced by ``:``). Thus,
``x[ind1,...,ind2,:]`` acts like ``x[ind1][...,ind2,:]`` under basic
slicing.
.. warning:: The above is **not** true for advanced slicing.
- You may use slicing to set values in the array, but (unlike lists) you
can never grow the array. The size of the value to be set in
``x[obj] = value`` must be (broadcastable) to the same shape as
``x[obj]``.
.. index::
pair: ndarray; view
.. note::
Remember that a slicing tuple can always be constructed as *obj*
and used in the ``x[obj]`` notation. Slice objects can be used in
the construction in place of the ``[start:stop:step]``
notation. For example, ``x[1:10:5,::-1]`` can also be implemented
as ``obj = (slice(1,10,5), slice(None,None,-1)); x[obj]`` . This
can be useful for constructing generic code that works on arrays
of arbitrary dimension.
.. data:: newaxis
The :const:`newaxis` object can be used in the basic slicing syntax
discussed above. :const:`None` can also be used instead of
:const:`newaxis`.
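For example:
>>> x = np.arange(3)
>>> x[:, np.newaxis].shape
(3, 1)
>>> x[:, None].shape    # None behaves identically
(3, 1)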
Advanced indexing
-----------------
Advanced indexing is triggered when the selection object, *obj*, is a
non-tuple sequence object, an :class:`ndarray` (of data type integer or bool),
or a tuple with at least one sequence object or ndarray (of data type
integer or bool). There are two types of advanced indexing: integer
and Boolean.
Advanced indexing always returns a *copy* of the data (contrast with
basic slicing that returns a :term:`view`).
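A quick illustration of the copy-versus-view distinction:
>>> x = np.arange(5)
>>> y = x[[1, 2]]       # advanced indexing: y is a copy
>>> y[0] = 99
>>> x                   # x is unchanged
array([0, 1, 2, 3, 4])
>>> z = x[1:3]          # basic slicing: z is a view
>>> z[0] = 99
>>> x
array([ 0, 99,  2,  3,  4])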
Integer
^^^^^^^
Integer indexing allows selection of arbitrary items in the array
based on their *N*-dimensional index. This kind of selection occurs
when advanced indexing is triggered and the selection object is not
an array of data type bool. For the discussion below, when the
selection object is not a tuple, it will be referred to as if it had
been promoted to a 1-tuple, which will be called the selection
tuple. The rules of advanced integer-style indexing are:
- If the length of the selection tuple is larger than *N* an error is raised.
- All sequences and scalars in the selection tuple are converted to
:class:`intp` indexing arrays.
- All selection tuple objects must be convertible to :class:`intp`
arrays, :class:`slice` objects, or the :const:`Ellipsis` object.
- The first :const:`Ellipsis` object will be expanded, and any other
:const:`Ellipsis` objects will be treated as full slice (``:``)
objects. The expanded :const:`Ellipsis` object is replaced with as
many full slice (``:``) objects as needed to make the length of the
selection tuple :math:`N`.
- If the selection tuple is smaller than *N*, then as many ``:``
objects as needed are added to the end of the selection tuple so
that the modified selection tuple has length *N*.
- All the integer indexing arrays must be :ref:`broadcastable
<arrays.broadcasting.broadcastable>` to the same shape.
- The shape of the output (or the needed shape of the object to be used
for setting) is the broadcasted shape.
- After expanding any ellipses and filling out any missing ``:``
objects in the selection tuple, then let :math:`N_t` be the number
of indexing arrays, and let :math:`N_s = N - N_t` be the number of
slice objects. Note that :math:`N_t > 0` (or we wouldn't be doing
advanced integer indexing).
- If :math:`N_s = 0` then the *M*-dimensional result is constructed by
varying the index tuple ``(i_1, ..., i_M)`` over the range
of the result shape and for each value of the index tuple
``(ind_1, ..., ind_M)``::
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
.. admonition:: Example
Suppose the shape of the broadcasted indexing arrays is 3-dimensional
and *N* is 2. Then the result is found by letting *i, j, k* run over
the shape found by broadcasting ``ind_1`` and ``ind_2``, and each
*i, j, k* yields::
result[i,j,k] = x[ind_1[i,j,k], ind_2[i,j,k]]
- If :math:`N_s > 0`, then partial indexing is done. This can be
somewhat mind-boggling to understand, but if you think in terms of
the shapes of the arrays involved, it can be easier to grasp what
happens. In simple cases (*i.e.* one indexing array and *N - 1* slice
objects) it does exactly what you would expect (concatenation of
repeated application of basic slicing). The rule for partial
indexing is that the shape of the result (or the interpreted shape
of the object to be used in setting) is the shape of *x* with the
indexed subspace replaced with the broadcasted indexing subspace. If
the index subspaces are right next to each other, then the
broadcasted indexing space directly replaces all of the indexed
subspaces in *x*. If the indexing subspaces are separated (by slice
objects), then the broadcasted indexing space is first, followed by
the sliced subspace of *x*.
.. admonition:: Example
Suppose ``x.shape`` is (10,20,30) and ``ind`` is a (2,3,4)-shaped
indexing :class:`intp` array, then ``result = x[...,ind,:]`` has
shape (10,2,3,4,30) because the (20,)-shaped subspace has been
replaced with a (2,3,4)-shaped broadcasted indexing subspace. If
we let *i, j, k* loop over the (2,3,4)-shaped subspace then
``result[...,i,j,k,:] = x[...,ind[i,j,k],:]``. This example
produces the same result as :meth:`x.take(ind, axis=-2) <ndarray.take>`.
.. admonition:: Example
Now let ``x.shape`` be (10,20,30,40,50) and suppose ``ind_1``
and ``ind_2`` are broadcastable to the shape (2,3,4). Then
``x[:,ind_1,ind_2]`` has shape (10,2,3,4,40,50) because the
(20,30)-shaped subspace from X has been replaced with the
(2,3,4) subspace from the indices. However,
``x[:,ind_1,:,ind_2]`` has shape (2,3,4,10,30,50) because there
is no unambiguous place to drop in the indexing subspace, thus
it is tacked-on to the beginning. It is always possible to use
:meth:`.transpose() <ndarray.transpose>` to move the subspace
anywhere desired. (Note that this example cannot be replicated
using :func:`take`.)
Boolean
^^^^^^^
This advanced indexing occurs when obj is an array object of Boolean
type (such as may be returned from comparison operators). It is always
equivalent to (but faster than) ``x[obj.nonzero()]`` where, as
described above, :meth:`obj.nonzero() <ndarray.nonzero>` returns a
tuple (of length :attr:`obj.ndim <ndarray.ndim>`) of integer index
arrays showing the :const:`True` elements of *obj*.
The special case when ``obj.ndim == x.ndim`` is worth mentioning. In
this case ``x[obj]`` returns a 1-dimensional array filled with the
elements of *x* corresponding to the :const:`True` values of *obj*.
The search order will be C-style (last index varies the fastest). If
*obj* has :const:`True` values at entries that are outside of the
bounds of *x*, then an index error will be raised.
You can also use Boolean arrays as element of the selection tuple. In
such instances, they will always be interpreted as :meth:`nonzero(obj)
<ndarray.nonzero>` and the equivalent integer indexing will be
done.
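For example:
>>> x = np.arange(12).reshape(3, 4)
>>> rows = np.array([False, True, True])
>>> x[rows, 2]          # same as x[rows.nonzero()[0], 2]
array([ 6, 10])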
.. warning::
The definition of advanced indexing means that ``x[(1,2,3),]`` is
fundamentally different than ``x[(1,2,3)]``. The latter is
equivalent to ``x[1,2,3]`` which will trigger basic selection while
the former will trigger advanced indexing. Be sure to understand
why this occurs.
Also recognize that ``x[[1,2,3]]`` will trigger advanced indexing,
whereas ``x[[1,2,slice(None)]]`` will trigger basic slicing.
.. _arrays.indexing.rec:
Record Access
-------------
.. seealso:: :ref:`arrays.dtypes`, :ref:`arrays.scalars`
If the :class:`ndarray` object is a record array, *i.e.* its data type
is a :term:`record` data type, the :term:`fields <field>` of the array
can be accessed by indexing the array with strings, much like a dictionary.
Indexing ``x['field-name']`` returns a new :term:`view` to the array,
which is of the same shape as *x* (except when the field is a
sub-array) but of data type ``x.dtype['field-name']`` and contains
only the part of the data in the specified field. Also record array
scalars can be "indexed" this way.
If the accessed field is a sub-array, the dimensions of the sub-array
are appended to the shape of the result.
.. admonition:: Example
>>> x = np.zeros((2,2), dtype=[('a', np.int32), ('b', np.float64, (3,3))])
>>> x['a'].shape
(2, 2)
>>> x['a'].dtype
dtype('int32')
>>> x['b'].shape
(2, 2, 3, 3)
>>> x['b'].dtype
dtype('float64')
Flat Iterator indexing
----------------------
:attr:`x.flat <ndarray.flat>` returns an iterator that will iterate
over the entire array (in C-contiguous style with the last index
varying the fastest). This iterator object can also be indexed using
basic slicing or advanced indexing as long as the selection object is
not a tuple. This should be clear from the fact that :attr:`x.flat
<ndarray.flat>` is a 1-dimensional view. It can be used for integer
indexing with 1-dimensional C-style-flat indices. The shape of any
returned array is therefore the shape of the integer indexing object.
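For example:
>>> x = np.arange(6).reshape(2, 3)
>>> x.flat[4]           # C-style flat index into the 2-d array
4
>>> x.flat[[0, 4]]      # the result shape matches the index object
array([0, 4])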
.. index::
single: indexing
single: ndarray

View file

@ -1,336 +0,0 @@
.. index::
pair: array; interface
pair: array; protocol
.. _arrays.interface:
*******************
The Array Interface
*******************
.. note::
This page describes the numpy-specific API for accessing the contents of
a numpy array from other C extensions. :pep:`3118` --
:cfunc:`The Revised Buffer Protocol <PyObject_GetBuffer>` introduces
similar, standardized API to Python 2.6 and 3.0 for any extension
module to use. Cython__'s buffer array support
uses the :pep:`3118` API; see the `Cython numpy
tutorial`__. Cython provides a way to write code that supports the buffer
protocol with Python versions older than 2.6 because it has a
backward-compatible implementation utilizing the legacy array interface
described here.
__ http://cython.org/
__ http://wiki.cython.org/tutorials/numpy
:version: 3
The array interface (sometimes called array protocol) was created in
2005 as a means for array-like Python objects to re-use each other's
data buffers intelligently whenever possible. The homogeneous
N-dimensional array interface is a default mechanism for objects to
share N-dimensional array memory and information. The interface
consists of a Python-side and a C-side using two attributes. Objects
wishing to be considered an N-dimensional array in application code
should support at least one of these attributes. Objects wishing to
support an N-dimensional array in application code should look for at
least one of these attributes and use the information provided
appropriately.
This interface describes homogeneous arrays in the sense that each
item of the array has the same "type". This type can be very simple
or it can be a quite arbitrary and complicated C-like structure.
There are two ways to use the interface: A Python side and a C-side.
Both are separate attributes.
Python side
===========
This approach to the interface consists of the object having an
:data:`__array_interface__` attribute.
.. data:: __array_interface__
A dictionary of items (3 required and 5 optional). The optional
keys in the dictionary have implied defaults if they are not
provided.
The keys are:
**shape** (required)
Tuple whose elements are the array size in each dimension. Each
entry is an integer (a Python int or long). Note that these
integers could be larger than the platform "int" or "long"
could hold (a Python int is a C long). It is up to the code
using this attribute to handle this appropriately; either by
raising an error when overflow is possible, or by using
:cdata:`Py_LONG_LONG` as the C type for the shapes.
**typestr** (required)
A string providing the basic type of the homogeneous array. The
basic string format consists of 3 parts: a character describing
the byteorder of the data (``<``: little-endian, ``>``:
big-endian, ``|``: not-relevant), a character code giving the
basic type of the array, and an integer providing the number of
bytes the type uses.
The basic type character codes are:
===== ================================================================
``t`` Bit field (following integer gives the number of
bits in the bit field).
``b`` Boolean (integer type where all values are only True or False)
``i`` Integer
``u`` Unsigned integer
``f`` Floating point
``c`` Complex floating point
``O`` Object (i.e. the memory contains a pointer to :ctype:`PyObject`)
``S`` String (fixed-length sequence of char)
``U`` Unicode (fixed-length sequence of :ctype:`Py_UNICODE`)
``V`` Other (void \* -- each item is a fixed-size chunk of memory)
===== ================================================================
**descr** (optional)
A list of tuples providing a more detailed description of the
memory layout for each item in the homogeneous array. Each
tuple in the list has two or three elements. Normally, this
attribute would be used when *typestr* is ``V[0-9]+``, but this is
not a requirement. The only requirement is that the number of
bytes represented in the *typestr* key is the same as the total
number of bytes represented here. The idea is to support
descriptions of C-like structs (records) that make up array
elements. The elements of each tuple in the list are
1. A string providing a name associated with this portion of
the record. This could also be a tuple of ``('full name',
'basic_name')`` where basic name would be a valid Python
variable name representing the full name of the field.
2. Either a basic-type description string as in *typestr* or
another list (for nested records)
3. An optional shape tuple providing how many times this part
of the record should be repeated. No repeats are assumed
if this is not given. Very complicated structures can be
described using this generic interface. Notice, however,
that each element of the array is still of the same
data-type. Some examples of using this interface are given
below.
**Default**: ``[('', typestr)]``
**data** (optional)
A 2-tuple whose first argument is an integer (a long integer
if necessary) that points to the data-area storing the array
contents. This pointer must point to the first element of
data (in other words any offset is always ignored in this
case). The second entry in the tuple is a read-only flag (true
means the data area is read-only).
This attribute can also be an object exposing the
:cfunc:`buffer interface <PyObject_AsCharBuffer>` which
will be used to share the data. If this key is not present (or
returns :class:`None`), then memory sharing will be done
through the buffer interface of the object itself. In this
case, the offset key can be used to indicate the start of the
buffer. A reference to the object exposing the array interface
must be stored by the new object if the memory area is to be
secured.
**Default**: :const:`None`
**strides** (optional)
Either :const:`None` to indicate a C-style contiguous array or
a tuple of strides which provides the number of bytes needed
to jump to the next array element in the corresponding
dimension. Each entry must be an integer (a Python
:const:`int` or :const:`long`). As with shape, the values may
be larger than can be represented by a C "int" or "long"; the
calling code should handle this appropriately, either by
raising an error, or by using :ctype:`Py_LONG_LONG` in C. The
default is :const:`None` which implies a C-style contiguous
memory buffer. In this model, the last dimension of the array
varies the fastest. For example, the default strides tuple
for an object whose array entries are 8 bytes long and whose
shape is (10,20,30) would be (4800, 240, 8) (see the example
below).
**Default**: :const:`None` (C-style contiguous)
**mask** (optional)
:const:`None` or an object exposing the array interface. All
elements of the mask array should be interpreted only as true
or not true indicating which elements of this array are valid.
The shape of this object should be :ref:`"broadcastable"
<arrays.broadcasting.broadcastable>` to the shape of the
original array.
**Default**: :const:`None` (All array values are valid)
**offset** (optional)
An integer offset into the array data region. This can only be
used when data is :const:`None` or returns a :class:`buffer`
object.
**Default**: 0.
**version** (required)
An integer showing the version of the interface (i.e. 3 for
this version). Be careful not to use this to invalidate
objects exposing future versions of the interface.
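
As a concrete illustration, NumPy arrays themselves expose
:data:`__array_interface__`, and a minimal producer only needs the
required keys plus a way to keep the underlying memory alive. A sketch
(the class name ``Interfaced`` is hypothetical; output shown for a
little-endian 64-bit platform)::

    >>> import numpy as np
    >>> a = np.arange(10, dtype=np.int32)
    >>> a.__array_interface__['shape'], a.__array_interface__['typestr']
    ((10,), '<i4')
    >>> a.__array_interface__['strides'] is None   # C-contiguous
    True

    >>> class Interfaced(object):
    ...     def __init__(self, arr):
    ...         self._arr = arr   # hold a reference so the memory stays valid
    ...         self.__array_interface__ = {
    ...             'shape': arr.shape,
    ...             'typestr': arr.__array_interface__['typestr'],
    ...             'data': arr.__array_interface__['data'],
    ...             'version': 3,
    ...         }
    >>> np.asarray(Interfaced(a))
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
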
C-struct access
===============
This approach to the array interface allows for faster access to an
array using only one attribute lookup and a well-defined C-structure.
.. cvar:: __array_struct__
A :ctype:`PyCObject` whose :cdata:`voidptr` member contains a
pointer to a filled :ctype:`PyArrayInterface` structure. Memory
for the structure is dynamically created and the :ctype:`PyCObject`
is also created with an appropriate destructor so the retriever of
this attribute simply has to apply :cfunc:`Py_DECREF()` to the
object returned by this attribute when it is finished. Also,
either the data needs to be copied out, or a reference to the
object exposing this attribute must be held to ensure the data is
not freed. Objects exposing the :obj:`__array_struct__` interface
must also not reallocate their memory if other objects are
referencing them.
The PyArrayInterface structure is defined in ``numpy/ndarrayobject.h``
as::
typedef struct {
int two; /* contains the integer 2 -- simple sanity check */
int nd; /* number of dimensions */
char typekind; /* kind in array --- character code of typestr */
int itemsize; /* size of each element */
int flags; /* flags indicating how the data should be interpreted */
/* must set ARR_HAS_DESCR bit to validate descr */
Py_intptr_t *shape; /* A length-nd array of shape information */
Py_intptr_t *strides; /* A length-nd array of stride information */
void *data; /* A pointer to the first element of the array */
PyObject *descr; /* NULL or data-description (same as descr key
of __array_interface__) -- must set ARR_HAS_DESCR
flag or this will be ignored. */
} PyArrayInterface;
The flags member may consist of 5 bits showing how the data should be
interpreted and one bit showing how the Interface should be
interpreted. The data-bits are :const:`CONTIGUOUS` (0x1),
:const:`FORTRAN` (0x2), :const:`ALIGNED` (0x100), :const:`NOTSWAPPED`
(0x200), and :const:`WRITEABLE` (0x400). A final flag
:const:`ARR_HAS_DESCR` (0x800) indicates whether or not this structure
has the arrdescr field. The field should not be accessed unless this
flag is present.
.. admonition:: New since June 16, 2006:
In the past most implementations used the "desc" member of the
:ctype:`PyCObject` itself (do not confuse this with the "descr" member of
the :ctype:`PyArrayInterface` structure above --- they are two separate
things) to hold the pointer to the object exposing the interface.
This is now an explicit part of the interface. Be sure to own a
reference to the object when the :ctype:`PyCObject` is created using
:cfunc:`PyCObject_FromVoidPtrAndDesc`.
Type description examples
=========================
For clarity it is useful to provide some examples of the type
description and corresponding :data:`__array_interface__` 'descr'
entries. Thanks to Scott Gilbert for these examples:
In every case, the 'descr' key is optional, but of course provides
more information which may be important for various applications::
* Float data
typestr == '>f4'
descr == [('','>f4')]
* Complex double
typestr == '>c8'
descr == [('real','>f4'), ('imag','>f4')]
* RGB Pixel data
typestr == '|V3'
descr == [('r','|u1'), ('g','|u1'), ('b','|u1')]
* Mixed endian (weird but could happen).
typestr == '|V8' (or '>u8')
descr == [('big','>i4'), ('little','<i4')]
* Nested structure
struct {
int ival;
struct {
unsigned short sval;
unsigned char bval;
unsigned char cval;
} sub;
}
typestr == '|V8' (or '<u8' if you want)
descr == [('ival','<i4'), ('sub', [('sval','<u2'), ('bval','|u1'), ('cval','|u1') ]) ]
* Nested array
struct {
int ival;
double data[16*4];
}
typestr == '|V516'
descr == [('ival','>i4'), ('data','>f8',(16,4))]
* Padded structure
struct {
int ival;
double dval;
}
typestr == '|V16'
descr == [('ival','>i4'),('','|V4'),('dval','>f8')]
It should be clear that any record type could be described using this interface.
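
These descriptions can be cross-checked from Python, since
:class:`numpy.dtype` accepts the same 'descr' lists; the itemsize of
the resulting dtype must match the byte count given in the typestr.
For the nested array example above::

    >>> import numpy as np
    >>> np.dtype([('ival', '>i4'), ('data', '>f8', (16, 4))]).itemsize
    516
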
Differences with Array interface (Version 2)
============================================
The version 2 interface was very similar. The differences were
largely aesthetic. In particular:
1. The PyArrayInterface structure had no descr member at the end
(and therefore no flag ARR_HAS_DESCR)
2. The desc member of the PyCObject returned from __array_struct__ was
not specified. Usually, it was the object exposing the array (so
that a reference to it could be kept and destroyed when the
C-object was destroyed). Now it must be a tuple whose first
element is a string with "PyArrayInterface Version #" and whose
second element is the object exposing the array.
3. The tuple returned from __array_interface__['data'] used to be a
hex-string (now it is an integer or a long integer).
4. There was no __array_interface__ attribute; instead, all of the keys
   (except for version) in the __array_interface__ dictionary were
   their own attributes. Thus, to obtain the Python-side information you
   had to access each of the following attributes separately:
* __array_data__
* __array_shape__
* __array_strides__
* __array_typestr__
* __array_descr__
* __array_offset__
* __array_mask__

View file

@ -1,567 +0,0 @@
.. _arrays.ndarray:
******************************************
The N-dimensional array (:class:`ndarray`)
******************************************
.. currentmodule:: numpy
An :class:`ndarray` is a (usually fixed-size) multidimensional
container of items of the same type and size. The number of dimensions
and items in an array is defined by its :attr:`shape <ndarray.shape>`,
which is a :class:`tuple` of *N* positive integers that specify the
sizes of each dimension. The type of items in the array is specified by
a separate :ref:`data-type object (dtype) <arrays.dtypes>`, one of which
is associated with each ndarray.
As with other container objects in Python, the contents of an
:class:`ndarray` can be accessed and modified by :ref:`indexing or
slicing <arrays.indexing>` the array (using, for example, *N* integers),
and via the methods and attributes of the :class:`ndarray`.
.. index:: view, base
Different :class:`ndarrays <ndarray>` can share the same data, so that
changes made in one :class:`ndarray` may be visible in another. That
is, an ndarray can be a *"view"* to another ndarray, and the data it
is referring to is taken care of by the *"base"* ndarray. ndarrays can
also be views to memory owned by Python :class:`strings <str>` or
objects implementing the :class:`buffer` or :ref:`array
<arrays.interface>` interfaces.
.. admonition:: Example
A 2-dimensional array of size 2 x 3, composed of 4-byte integer
elements:
>>> x = np.array([[1, 2, 3], [4, 5, 6]], np.int32)
>>> type(x)
<type 'numpy.ndarray'>
>>> x.shape
(2, 3)
>>> x.dtype
dtype('int32')
The array can be indexed using Python container-like syntax:
>>> x[1, 2]  # the element of x in the *second* row, *third* column
6
For example :ref:`slicing <arrays.indexing>` can produce views of
the array:
>>> y = x[:,1]
>>> y
array([2, 5])
>>> y[0] = 9 # this also changes the corresponding element in x
>>> y
array([9, 5])
>>> x
array([[1, 9, 3],
[4, 5, 6]])
Constructing arrays
===================
New arrays can be constructed using the routines detailed in
:ref:`routines.array-creation`, and also by using the low-level
:class:`ndarray` constructor:
.. autosummary::
:toctree: generated/
ndarray
.. _arrays.ndarray.indexing:
Indexing arrays
===============
Arrays can be indexed using an extended Python slicing syntax,
``array[selection]``. Similar syntax is also used for accessing
fields in a :ref:`record array <arrays.dtypes>`.
.. seealso:: :ref:`Array Indexing <arrays.indexing>`.
Internal memory layout of an ndarray
====================================
An instance of class :class:`ndarray` consists of a contiguous
one-dimensional segment of computer memory (owned by the array, or by
some other object), combined with an indexing scheme that maps *N*
integers into the location of an item in the block. The ranges in
which the indices can vary are specified by the :obj:`shape
<ndarray.shape>` of the array. How many bytes each item takes and how
the bytes are interpreted is defined by the :ref:`data-type object
<arrays.dtypes>` associated with the array.
.. index:: C-order, Fortran-order, row-major, column-major, stride,
offset
A segment of memory is inherently 1-dimensional, and there are many
different schemes for arranging the items of an *N*-dimensional array
in a 1-dimensional block. Numpy is flexible, and :class:`ndarray`
objects can accommodate any *strided indexing scheme*. In a strided
scheme, the N-dimensional index :math:`(n_0, n_1, ..., n_{N-1})`
corresponds to the offset (in bytes):
.. math:: n_{\mathrm{offset}} = \sum_{k=0}^{N-1} s_k n_k
from the beginning of the memory block associated with the
array. Here, :math:`s_k` are integers which specify the :obj:`strides
<ndarray.strides>` of the array. The :term:`column-major` order (used,
for example, in the Fortran language and in *Matlab*) and
:term:`row-major` order (used in C) schemes are just specific kinds of
strided scheme, and correspond to the strides:
.. math::

   s_k^{\mathrm{column}} = \mathrm{itemsize} \prod_{j=0}^{k-1} d_j ,
   \quad  s_k^{\mathrm{row}} = \mathrm{itemsize} \prod_{j=k+1}^{N-1} d_j .

.. index:: single-segment, contiguous, non-contiguous

where :math:`d_j` = ``self.shape[j]``.
Both the C and Fortran orders are :term:`contiguous`, *i.e.,*
:term:`single-segment`, memory layouts, in which every part of the
memory block can be accessed by some combination of the indices.
Data in new :class:`ndarrays <ndarray>` is in the :term:`row-major`
(C) order, unless otherwise specified, but, for example, :ref:`basic
array slicing <arrays.indexing>` often produces :term:`views <view>`
in a different scheme.
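
.. admonition:: Example

   A C-ordered array of shape (2, 3) with 8-byte items has strides
   ``(3*8, 8) == (24, 8)``, so the item at index ``(1, 2)`` lives at
   byte offset ``1*24 + 2*8 == 40`` from the start of the data:

   >>> x = np.arange(6, dtype=np.float64).reshape(2, 3)
   >>> x.strides
   (24, 8)
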
.. seealso:: :ref:`Indexing <arrays.ndarray.indexing>`
.. note::
Several algorithms in NumPy work on arbitrarily strided arrays.
However, some algorithms require single-segment arrays. When an
irregularly strided array is passed in to such algorithms, a copy
is automatically made.
.. _arrays.ndarray.attributes:
Array attributes
================
Array attributes reflect information that is intrinsic to the array
itself. Generally, accessing an array through its attributes allows
you to get and sometimes set intrinsic properties of the array without
creating a new array. The exposed attributes are the core parts of an
array and only some of them can be reset meaningfully without creating
a new array. Information on each attribute is given below.
Memory layout
-------------
The following attributes contain information about the memory layout
of the array:
.. autosummary::
:toctree: generated/
ndarray.flags
ndarray.shape
ndarray.strides
ndarray.ndim
ndarray.data
ndarray.size
ndarray.itemsize
ndarray.nbytes
ndarray.base
Data type
---------
.. seealso:: :ref:`Data type objects <arrays.dtypes>`
The data type object associated with the array can be found in the
:attr:`dtype <ndarray.dtype>` attribute:
.. autosummary::
:toctree: generated/
ndarray.dtype
Other attributes
----------------
.. autosummary::
:toctree: generated/
ndarray.T
ndarray.real
ndarray.imag
ndarray.flat
ndarray.ctypes
__array_priority__
.. _arrays.ndarray.array-interface:
Array interface
---------------
.. seealso:: :ref:`arrays.interface`.
========================== ===================================
:obj:`__array_interface__` Python-side of the array interface
:obj:`__array_struct__` C-side of the array interface
========================== ===================================
:mod:`ctypes` foreign function interface
----------------------------------------
.. autosummary::
:toctree: generated/
ndarray.ctypes
.. _array.ndarray.methods:
Array methods
=============
An :class:`ndarray` object has many methods which operate on or with
the array in some fashion, typically returning an array result. These
methods are briefly explained below. (Each method's docstring has a
more complete description.)
For the following methods there are also corresponding functions in
:mod:`numpy`: :func:`all`, :func:`any`, :func:`argmax`,
:func:`argmin`, :func:`argsort`, :func:`choose`, :func:`clip`,
:func:`compress`, :func:`copy`, :func:`cumprod`, :func:`cumsum`,
:func:`diagonal`, :func:`imag`, :func:`max <amax>`, :func:`mean`,
:func:`min <amin>`, :func:`nonzero`, :func:`prod`, :func:`ptp`,
:func:`put`, :func:`ravel`, :func:`real`, :func:`repeat`,
:func:`reshape`, :func:`round <around>`, :func:`searchsorted`,
:func:`sort`, :func:`squeeze`, :func:`std`, :func:`sum`,
:func:`swapaxes`, :func:`take`, :func:`trace`, :func:`transpose`,
:func:`var`.
Array conversion
----------------
.. autosummary::
:toctree: generated/
ndarray.item
ndarray.tolist
ndarray.itemset
ndarray.tostring
ndarray.tofile
ndarray.dump
ndarray.dumps
ndarray.astype
ndarray.byteswap
ndarray.copy
ndarray.view
ndarray.getfield
ndarray.setflags
ndarray.fill
Shape manipulation
------------------
For reshape, resize, and transpose, the single tuple argument may be
replaced with ``n`` integers which will be interpreted as an n-tuple.
.. autosummary::
:toctree: generated/
ndarray.reshape
ndarray.resize
ndarray.transpose
ndarray.swapaxes
ndarray.flatten
ndarray.ravel
ndarray.squeeze
Item selection and manipulation
-------------------------------
For array methods that take an *axis* keyword, it defaults to
:const:`None`. If axis is *None*, then the array is treated as a 1-D
array. Any other value for *axis* represents the dimension along which
the operation should proceed.
.. autosummary::
:toctree: generated/
ndarray.take
ndarray.put
ndarray.repeat
ndarray.choose
ndarray.sort
ndarray.argsort
ndarray.searchsorted
ndarray.nonzero
ndarray.compress
ndarray.diagonal
Calculation
-----------
.. index:: axis
Many of these methods take an argument named *axis*. In such cases,
- If *axis* is *None* (the default), the array is treated as a 1-D
array and the operation is performed over the entire array. This
behavior is also the default if self is a 0-dimensional array or
array scalar. (An array scalar is an instance of the types/classes
float32, float64, etc., whereas a 0-dimensional array is an ndarray
instance containing precisely one array scalar.)
- If *axis* is an integer, then the operation is done over the given
axis (for each 1-D subarray that can be created along the given axis).
.. admonition:: Example of the *axis* argument
A 3-dimensional array of size 3 x 3 x 3, summed over each of its
three axes
>>> x
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
>>> x.sum(axis=0)
array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]])
>>> # for sum, axis is the first keyword, so we may omit it,
>>> # specifying only its value
>>> x.sum(0), x.sum(1), x.sum(2)
(array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]]),
array([[ 9, 12, 15],
[36, 39, 42],
[63, 66, 69]]),
array([[ 3, 12, 21],
[30, 39, 48],
[57, 66, 75]]))
The parameter *dtype* specifies the data type over which a reduction
operation (like summing) should take place. The default reduce data
type is the same as the data type of *self*. To avoid overflow, it can
be useful to perform the reduction using a larger data type.
For several methods, an optional *out* argument can also be provided
and the result will be placed into the output array given. The *out*
argument must be an :class:`ndarray` and have the same number of
elements. It can have a different data type in which case casting will
be performed.
.. autosummary::
:toctree: generated/
ndarray.argmax
ndarray.min
ndarray.argmin
ndarray.ptp
ndarray.clip
ndarray.conj
ndarray.round
ndarray.trace
ndarray.sum
ndarray.cumsum
ndarray.mean
ndarray.var
ndarray.std
ndarray.prod
ndarray.cumprod
ndarray.all
ndarray.any
Arithmetic and comparison operations
====================================
.. index:: comparison, arithmetic, operation, operator
Arithmetic and comparison operations on :class:`ndarrays <ndarray>`
are defined as element-wise operations, and generally yield
:class:`ndarray` objects as results.
Each of the arithmetic operations (``+``, ``-``, ``*``, ``/``, ``//``,
``%``, ``divmod()``, ``**`` or ``pow()``, ``<<``, ``>>``, ``&``,
``^``, ``|``, ``~``) and the comparisons (``==``, ``<``, ``>``,
``<=``, ``>=``, ``!=``) is equivalent to the corresponding
:term:`universal function` (or :term:`ufunc` for short) in Numpy. For
more information, see the section on :ref:`Universal Functions
<ufuncs>`.
Comparison operators:
.. autosummary::
:toctree: generated/
ndarray.__lt__
ndarray.__le__
ndarray.__gt__
ndarray.__ge__
ndarray.__eq__
ndarray.__ne__
Truth value of an array (:func:`bool()`):
.. autosummary::
:toctree: generated/
ndarray.__nonzero__
.. note::
Truth-value testing of an array invokes
:meth:`ndarray.__nonzero__`, which raises an error if the number of
elements in the array is larger than 1, because the truth value
of such arrays is ambiguous. Use :meth:`.any() <ndarray.any>` and
:meth:`.all() <ndarray.all>` instead to be clear about what is meant
in such cases. (If the number of elements is 0, the array evaluates
to ``False``.)
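
.. admonition:: Example

   >>> bool(np.array([2]))      # a single element has a truth value
   True
   >>> np.array([1, 0]).any(), np.array([1, 0]).all()
   (True, False)
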
Unary operations:
.. autosummary::
:toctree: generated/
ndarray.__neg__
ndarray.__pos__
ndarray.__abs__
ndarray.__invert__
Arithmetic:
.. autosummary::
:toctree: generated/
ndarray.__add__
ndarray.__sub__
ndarray.__mul__
ndarray.__div__
ndarray.__truediv__
ndarray.__floordiv__
ndarray.__mod__
ndarray.__divmod__
ndarray.__pow__
ndarray.__lshift__
ndarray.__rshift__
ndarray.__and__
ndarray.__or__
ndarray.__xor__
.. note::
- Any third argument to :func:`pow()` is silently ignored,
as the underlying :func:`ufunc <power>` takes only two arguments.
- The three division operators are all defined; :obj:`div` is active
by default, :obj:`truediv` is active when
:obj:`__future__` division is in effect.
- Because :class:`ndarray` is a built-in type (written in C), the
``__r{op}__`` special methods are not directly defined.
- The functions called to implement many arithmetic special methods
for arrays can be modified using :func:`set_numeric_ops`.
Arithmetic, in-place:
.. autosummary::
:toctree: generated/
ndarray.__iadd__
ndarray.__isub__
ndarray.__imul__
ndarray.__idiv__
ndarray.__itruediv__
ndarray.__ifloordiv__
ndarray.__imod__
ndarray.__ipow__
ndarray.__ilshift__
ndarray.__irshift__
ndarray.__iand__
ndarray.__ior__
ndarray.__ixor__
.. warning::
In-place operations will perform the calculation using the
precision decided by the data type of the two operands, but will
silently downcast the result (if necessary) so it can fit back into
the array. Therefore, for mixed precision calculations, ``A {op}=
B`` can be different than ``A = A {op} B``. For example, suppose
``a = ones((3,3))``. Then, ``a += 3j`` is different than ``a = a +
3j``: while they both perform the same computation, ``a += 3j``
casts the result to fit back in ``a``, whereas ``a = a + 3j``
re-binds the name ``a`` to the result.
Special methods
===============
For standard library functions:
.. autosummary::
:toctree: generated/
ndarray.__copy__
ndarray.__deepcopy__
ndarray.__reduce__
ndarray.__setstate__
Basic customization:
.. autosummary::
:toctree: generated/
ndarray.__new__
ndarray.__array__
ndarray.__array_wrap__
Container customization: (see :ref:`Indexing <arrays.indexing>`)
.. autosummary::
:toctree: generated/
ndarray.__len__
ndarray.__getitem__
ndarray.__setitem__
ndarray.__getslice__
ndarray.__setslice__
ndarray.__contains__
Conversion; the operations :func:`complex()`, :func:`int()`,
:func:`long()`, :func:`float()`, :func:`oct()`, and
:func:`hex()`. They work only on arrays that have one element in them
and return the appropriate scalar.
.. autosummary::
:toctree: generated/
ndarray.__int__
ndarray.__long__
ndarray.__float__
ndarray.__oct__
ndarray.__hex__
String representations:
.. autosummary::
:toctree: generated/
ndarray.__str__
ndarray.__repr__

View file

@ -1,47 +0,0 @@
.. _arrays:
*************
Array objects
*************
.. currentmodule:: numpy
NumPy provides an N-dimensional array type, the :ref:`ndarray
<arrays.ndarray>`, which describes a collection of "items" of the same
type. The items can be :ref:`indexed <arrays.indexing>` using for
example N integers.
All ndarrays are :term:`homogeneous`: every item takes up the same size
block of memory, and all blocks are interpreted in exactly the same
way. How each item in the array is to be interpreted is specified by a
separate :ref:`data-type object <arrays.dtypes>`, one of which is associated
with every array. In addition to basic types (integers, floats,
*etc.*), the data type objects can also represent data structures.
An item extracted from an array, *e.g.*, by indexing, is represented
by a Python object whose type is one of the :ref:`array scalar types
<arrays.scalars>` built in Numpy. The array scalars also allow easy
manipulation of more complicated arrangements of data.
.. figure:: figures/threefundamental.png
**Figure**
Conceptual diagram showing the relationship between the three
fundamental objects used to describe the data in an array: 1) the
ndarray itself, 2) the data-type object that describes the layout
of a single fixed-size element of the array, 3) the array-scalar
Python object that is returned when a single element of the array
is accessed.
.. toctree::
:maxdepth: 2
arrays.ndarray
arrays.scalars
arrays.dtypes
arrays.indexing
arrays.classes
maskedarray
arrays.interface

View file

@ -1,284 +0,0 @@
.. _arrays.scalars:
*******
Scalars
*******
.. currentmodule:: numpy
Python defines only one type of a particular data class (there is only
one integer type, one floating-point type, etc.). This can be
convenient in applications that don't need to be concerned with all
the ways data can be represented in a computer. For scientific
computing, however, more control is often needed.
In NumPy, there are 21 new fundamental Python types to describe
different types of scalars. These type descriptors are mostly based on
the types available in the C language that CPython is written in, with
several additional types compatible with Python's types.
Array scalars have the same attributes and methods as :class:`ndarrays
<ndarray>`. [#]_ This allows one to treat items of an array partly on
the same footing as arrays, smoothing out rough edges that result when
mixing scalar and array operations.
Array scalars live in a hierarchy (see the Figure below) of data
types. They can be detected using the hierarchy: For example,
``isinstance(val, np.generic)`` will return :const:`True` if *val* is
an array scalar object. Alternatively, what kind of array scalar is
present can be determined using other members of the data type
hierarchy. Thus, for example ``isinstance(val, np.complexfloating)``
will return :const:`True` if *val* is a complex valued type, while
``isinstance(val, np.flexible)`` will return :const:`True` if *val* is one
of the flexible itemsize array types (:class:`string`,
:class:`unicode`, :class:`void`).
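
.. admonition:: Example

   >>> val = np.float64(1.0)
   >>> isinstance(val, np.generic)
   True
   >>> isinstance(val, np.complexfloating)
   False
   >>> isinstance(np.complex128(1j), np.complexfloating)
   True
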
.. figure:: figures/dtype-hierarchy.png
**Figure:** Hierarchy of type objects representing the array data
types. Not shown are the two integer types :class:`intp` and
:class:`uintp` which just point to the integer type that holds a
pointer for the platform. All the number types can be obtained
using bit-width names as well.
.. [#] However, array scalars are immutable, so none of the array
scalar attributes are settable.
.. _arrays.scalars.character-codes:
.. _arrays.scalars.built-in:
Built-in scalar types
=====================
The built-in scalar types are shown below. Along with their (mostly)
C-derived names, the integer, float, and complex data-types are also
available using a bit-width convention so that an array of the right
size can always be ensured (e.g. :class:`int8`, :class:`float64`,
:class:`complex128`). Two aliases (:class:`intp` and :class:`uintp`)
pointing to the integer type that is sufficiently large to hold a C pointer
are also provided. The C-like names are associated with character codes,
which are shown in the table. Use of the character codes, however,
is discouraged.
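
.. admonition:: Example

   >>> x = np.int16(42)           # bit-width name
   >>> x.dtype
   dtype('int16')
   >>> np.dtype('h') == np.int16  # character code 'h' (discouraged)
   True
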
Five of the scalar types are essentially equivalent to fundamental
Python types and therefore inherit from them as well as from the
generic array scalar type:
==================== ====================
Array scalar type Related Python type
==================== ====================
:class:`int_` :class:`IntType`
:class:`float_` :class:`FloatType`
:class:`complex_` :class:`ComplexType`
:class:`str_` :class:`StringType`
:class:`unicode_` :class:`UnicodeType`
==================== ====================
The :class:`bool_` data type is very similar to the Python
:class:`BooleanType` but does not inherit from it because Python's
:class:`BooleanType` does not allow itself to be inherited from, and
on the C-level the size of the actual bool data is not the same as a
Python Boolean scalar.
.. warning::
The :class:`bool_` type is not a subclass of the :class:`int_` type
(the :class:`bool_` is not even a number type). This is different
than Python's default implementation of :class:`bool` as a
sub-class of int.
.. tip:: The default data type in Numpy is :class:`float_`.
In the tables below, ``platform?`` means that the type may not be
available on all platforms. Compatibility with different C or Python
types is indicated: two types are compatible if their data is of the
same size and interpreted in the same way.
Booleans:
=================== ============================= ===============
Type Remarks Character code
=================== ============================= ===============
:class:`bool_` compatible: Python bool ``'?'``
:class:`bool8` 8 bits
=================== ============================= ===============
Integers:
=================== ============================= ===============
:class:`byte` compatible: C char ``'b'``
:class:`short` compatible: C short ``'h'``
:class:`intc` compatible: C int ``'i'``
:class:`int_` compatible: Python int ``'l'``
:class:`longlong` compatible: C long long ``'q'``
:class:`intp` large enough to fit a pointer ``'p'``
:class:`int8` 8 bits
:class:`int16` 16 bits
:class:`int32` 32 bits
:class:`int64` 64 bits
=================== ============================= ===============
Unsigned integers:
=================== ============================= ===============
:class:`ubyte` compatible: C unsigned char ``'B'``
:class:`ushort` compatible: C unsigned short ``'H'``
:class:`uintc` compatible: C unsigned int ``'I'``
:class:`uint` compatible: Python int ``'L'``
:class:`ulonglong` compatible: C unsigned long long ``'Q'``
:class:`uintp` large enough to fit a pointer ``'P'``
:class:`uint8` 8 bits
:class:`uint16` 16 bits
:class:`uint32` 32 bits
:class:`uint64` 64 bits
=================== ============================= ===============
Floating-point numbers:
=================== ============================= ===============
:class:`single` compatible: C float ``'f'``
:class:`double` compatible: C double
:class:`float_` compatible: Python float ``'d'``
:class:`longfloat` compatible: C long double ``'g'``
:class:`float32` 32 bits
:class:`float64` 64 bits
:class:`float96` 96 bits, platform?
:class:`float128` 128 bits, platform?
=================== ============================= ===============
Complex floating-point numbers:
=================== ============================= ===============
:class:`csingle` ``'F'``
:class:`complex_` compatible: Python complex ``'D'``
:class:`clongfloat` ``'G'``
:class:`complex64` two 32-bit floats
:class:`complex128` two 64-bit floats
:class:`complex192` two 96-bit floats, platform?
:class:`complex256` two 128-bit floats, platform?
=================== ============================= ===============
Any Python object:
=================== ============================= ===============
:class:`object_` any Python object ``'O'``
=================== ============================= ===============
.. note::
The data actually stored in :term:`object arrays <object array>`
(*i.e.*, arrays having dtype :class:`object_`) are references to
Python objects, not the objects themselves. Hence, object arrays
behave more like usual Python :class:`lists <list>`, in the sense
that their contents need not be of the same Python type.
The object type is also special because an array containing
:class:`object_` items does not return an :class:`object_` object
on item access, but instead returns the actual object that
the array item refers to.
The following data types are :term:`flexible`. They have no predefined
size: the data they describe can be of different length in different
arrays. (In the character codes ``#`` is an integer denoting how many
elements the data type consists of.)
=================== ============================= ========
:class:`str_` compatible: Python str ``'S#'``
:class:`unicode_` compatible: Python unicode ``'U#'``
:class:`void` ``'V#'``
=================== ============================= ========
.. warning::
Numeric Compatibility: If you used old typecode characters in your
Numeric code (which was never recommended), you will need to change
some of them to the new characters. In particular, the needed
changes are ``c -> S1``, ``b -> B``, ``1 -> b``, ``s -> h``, ``w ->
H``, and ``u -> I``. These changes make the type character
convention more consistent with other Python modules such as the
:mod:`struct` module.
Attributes
==========
The array scalar objects have an :obj:`array priority
<__array_priority__>` of :cdata:`NPY_SCALAR_PRIORITY`
(-1,000,000.0). They also do not (yet) have a :attr:`ctypes <ndarray.ctypes>`
attribute. Otherwise, they share the same attributes as arrays:
.. autosummary::
:toctree: generated/
generic.flags
generic.shape
generic.strides
generic.ndim
generic.data
generic.size
generic.itemsize
generic.base
generic.dtype
generic.real
generic.imag
generic.flat
generic.T
generic.__array_interface__
generic.__array_struct__
generic.__array_priority__
generic.__array_wrap__
Indexing
========
.. seealso:: :ref:`arrays.indexing`, :ref:`arrays.dtypes`
Array scalars can be indexed like 0-dimensional arrays: if *x* is an
array scalar,
- ``x[()]`` returns a 0-dimensional :class:`ndarray`
- ``x['field-name']`` returns the array scalar in the field *field-name*.
(*x* can have fields, for example, when it corresponds to a record data type.)
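
.. admonition:: Example

   Field access on a record array scalar:

   >>> x = np.zeros(2, dtype=[('a', np.int32), ('b', np.float64)])[0]
   >>> x['a']
   0
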
Methods
=======
Array scalars have exactly the same methods as arrays. The default
behavior of these methods is to internally convert the scalar to an
equivalent 0-dimensional array and to call the corresponding array
method. In addition, math operations on array scalars are defined so
that the same hardware flags are set and used to interpret the results
as for :ref:`ufuncs <ufuncs>`; the error state used for ufuncs
therefore carries over to the math on array scalars.
The exceptions to the above rules are given below:
.. autosummary::
:toctree: generated/
generic
generic.__array__
generic.__array_wrap__
generic.__squeeze__
generic.byteswap
generic.__reduce__
generic.__setstate__
generic.setflags
Defining new types
==================
There are two ways to effectively define a new array scalar type
(apart from composing record :ref:`dtypes <arrays.dtypes>` from the built-in
scalar types): One way is to simply subclass the :class:`ndarray` and
overwrite the methods of interest. This will work to a degree, but
internally certain behaviors are fixed by the data type of the array.
To fully customize the data type of an array you need to define a new
data-type, and register it with NumPy. Such new types can only be
defined in C, using the :ref:`Numpy C-API <c-api>`.

File diff suppressed because it is too large

View file

@ -1,104 +0,0 @@
System configuration
====================
.. sectionauthor:: Travis E. Oliphant
When NumPy is built, information about system configuration is
recorded, and is made available for extension modules using Numpy's C
API. These are mostly defined in ``numpyconfig.h`` (included in
``ndarrayobject.h``). The public symbols are prefixed by ``NPY_*``.
Numpy also offers some functions for querying information about the
platform in use.
For private use, Numpy also constructs a ``config.h`` in the NumPy
include directory, which is not exported by Numpy (that is, a Python
extension which uses the numpy C API will not see those symbols), to
avoid namespace pollution.
Data type sizes
---------------
The :cdata:`NPY_SIZEOF_{CTYPE}` constants are defined so that sizeof
information is available to the pre-processor.
.. cvar:: NPY_SIZEOF_SHORT
sizeof(short)
.. cvar:: NPY_SIZEOF_INT
sizeof(int)
.. cvar:: NPY_SIZEOF_LONG
sizeof(long)
.. cvar:: NPY_SIZEOF_LONG_LONG
sizeof(longlong) where longlong is defined appropriately on the
platform (A macro defines **NPY_SIZEOF_LONGLONG** as well.)
.. cvar:: NPY_SIZEOF_PY_LONG_LONG
.. cvar:: NPY_SIZEOF_FLOAT
sizeof(float)
.. cvar:: NPY_SIZEOF_DOUBLE
sizeof(double)
.. cvar:: NPY_SIZEOF_LONG_DOUBLE
sizeof(longdouble) (A macro defines **NPY_SIZEOF_LONGDOUBLE** as well.)
.. cvar:: NPY_SIZEOF_PY_INTPTR_T
Size of a pointer on this platform (sizeof(void \*)) (A macro defines
NPY_SIZEOF_INTP as well.)
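
From Python, the corresponding sizes can be inspected through dtype
itemsizes (output shown for a platform with 8-byte pointers and an
8-byte long long):

>>> import numpy as np
>>> np.dtype(np.intp).itemsize       # matches NPY_SIZEOF_PY_INTPTR_T
8
>>> np.dtype(np.longlong).itemsize   # matches NPY_SIZEOF_LONG_LONG
8
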
Platform information
--------------------
.. cvar:: NPY_CPU_X86
.. cvar:: NPY_CPU_AMD64
.. cvar:: NPY_CPU_IA64
.. cvar:: NPY_CPU_PPC
.. cvar:: NPY_CPU_PPC64
.. cvar:: NPY_CPU_SPARC
.. cvar:: NPY_CPU_SPARC64
.. cvar:: NPY_CPU_S390
.. cvar:: NPY_CPU_PARISC
.. versionadded:: 1.3.0
CPU architecture of the platform; only one of the above is
defined.
Defined in ``numpy/npy_cpu.h``
.. cvar:: NPY_LITTLE_ENDIAN
.. cvar:: NPY_BIG_ENDIAN
.. cvar:: NPY_BYTE_ORDER
.. versionadded:: 1.3.0
Portable alternatives to the ``endian.h`` macros of GNU Libc.
If big endian, :cdata:`NPY_BYTE_ORDER` == :cdata:`NPY_BIG_ENDIAN`, and
similarly for little endian architectures.
Defined in ``numpy/npy_endian.h``.
.. cfunction:: PyArray_GetEndianness()
.. versionadded:: 1.3.0
Returns the endianness of the current platform.
One of :cdata:`NPY_CPU_BIG`, :cdata:`NPY_CPU_LITTLE`,
or :cdata:`NPY_CPU_UNKNOWN_ENDIAN`.

View file

@ -1,183 +0,0 @@
Numpy core libraries
====================
.. sectionauthor:: David Cournapeau
.. versionadded:: 1.3.0
Starting from numpy 1.3.0, we are working on separating the pure C,
"computational" code from the python dependent code. The goal is twofolds:
making the code cleaner, and enabling code reuse by other extensions outside
numpy (scipy, etc...).
Numpy core math library
-----------------------
The numpy core math library ('npymath') is a first step in this direction. This
library contains most math-related C99 functionality, which can be used on
platforms where C99 is not well supported. The core math functions have the
same API as the C99 ones, except for the npy_* prefix.
The available functions are defined in npy_math.h; please refer to this header
when in doubt.
Floating point classification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. cvar:: NPY_NAN
This macro is defined to a NaN (Not a Number), and is guaranteed to have
the sign bit unset (a 'positive' NaN). The corresponding single and extended
precision macros are available with the suffixes F and L.
.. cvar:: NPY_INFINITY
This macro is defined to a positive inf. The corresponding single and
extended precision macros are available with the suffixes F and L.
.. cvar:: NPY_PZERO
This macro is defined to positive zero. The corresponding single and
extended precision macros are available with the suffixes F and L.
.. cvar:: NPY_NZERO
This macro is defined to negative zero (that is, with the sign bit
set). The corresponding single and extended precision macros are
available with the suffixes F and L.
.. cfunction:: int npy_isnan(x)
This is a macro, and is equivalent to C99 isnan: works for single, double
and extended precision, and returns a nonzero value if x is a NaN.
.. cfunction:: int npy_isfinite(x)
This is a macro, and is equivalent to C99 isfinite: works for single,
double and extended precision, and returns a nonzero value if x is neither
a NaN nor an infinity.
.. cfunction:: int npy_isinf(x)
This is a macro, and is equivalent to C99 isinf: works for single, double
and extended precision, and returns a nonzero value if x is infinite
(positive or negative).
.. cfunction:: int npy_signbit(x)
This is a macro, and is equivalent to C99 signbit: works for single, double
and extended precision, and returns a nonzero value if x has the sign bit
set (that is, the number is negative).
.. cfunction:: double npy_copysign(double x, double y)
This is a function equivalent to C99 copysign: returns x with the same sign
as y. Works for any value, including inf and NaN. Single and extended
precision versions are available with the suffixes f and l.
.. versionadded:: 1.4.0
Useful math constants
~~~~~~~~~~~~~~~~~~~~~
The following math constants are available in npy_math.h. Single and extended
precision are also available by adding the F and L suffixes respectively.
.. cvar:: NPY_E
Base of natural logarithm (:math:`e`)
.. cvar:: NPY_LOG2E
Logarithm to base 2 of the Euler constant (:math:`\frac{\ln(e)}{\ln(2)}`)
.. cvar:: NPY_LOG10E
Logarithm to base 10 of the Euler constant (:math:`\frac{\ln(e)}{\ln(10)}`)
.. cvar:: NPY_LOGE2
Natural logarithm of 2 (:math:`\ln(2)`)
.. cvar:: NPY_LOGE10
Natural logarithm of 10 (:math:`\ln(10)`)
.. cvar:: NPY_PI
Pi (:math:`\pi`)
.. cvar:: NPY_PI_2
Pi divided by 2 (:math:`\frac{\pi}{2}`)
.. cvar:: NPY_PI_4
Pi divided by 4 (:math:`\frac{\pi}{4}`)
.. cvar:: NPY_1_PI
Reciprocal of pi (:math:`\frac{1}{\pi}`)
.. cvar:: NPY_2_PI
Two times the reciprocal of pi (:math:`\frac{2}{\pi}`)
.. cvar:: NPY_EULER
The Euler constant
:math:`\lim_{n\rightarrow\infty}({\sum_{k=1}^n{\frac{1}{k}}-\ln n})`
Low-level floating point manipulation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These can be useful for precise floating point comparison.
.. cfunction:: double npy_nextafter(double x, double y)
This is a function equivalent to C99 nextafter: returns the next
representable floating point value after x in the direction of y. Single
and extended precision versions are available with the suffixes f and l.
.. versionadded:: 1.4.0
.. cfunction:: double npy_spacing(double x)
This function is equivalent to the Fortran intrinsic ``SPACING``: returns
the distance between x and the next representable floating point value
larger than x, e.g. spacing(1) == eps. The spacing of NaN and +/- inf
returns NaN. Single and extended precision versions are available with
the suffixes f and l.
.. versionadded:: 1.4.0
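
The Python-level counterparts :func:`numpy.nextafter` and
:func:`numpy.spacing` (also added in 1.4.0) illustrate the semantics:

>>> import numpy as np
>>> np.nextafter(1.0, 2.0) - 1.0 == np.finfo(np.float64).eps
True
>>> np.spacing(1.0) == np.finfo(np.float64).eps
True
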
Complex functions
~~~~~~~~~~~~~~~~~
.. versionadded:: 1.4.0
C99-like complex functions have been added. These can be used if you wish to
implement portable C extensions. Since we still support platforms without a
C99 complex type, you need to restrict yourself to C90-compatible syntax, e.g.:
.. code-block:: c
/* a = 1 + 2i */
npy_complex a = npy_cpack(1, 2);
npy_complex b;
b = npy_log(a);
Linking against the core math library in an extension
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. versionadded:: 1.4.0
To use the core math library in your own extension, you need to add the npymath
compile and link options to your extension in your setup.py:
>>> from numpy.distutils.misc_util import get_info
>>> info = get_info('npymath')
>>> config.add_extension('foo', sources=['foo.c'], extra_info=info)
In other words, the usage of info is exactly the same as when using blas_info
and co.

View file

@ -1,218 +0,0 @@
Data Type API
=============
.. sectionauthor:: Travis E. Oliphant
The standard array can have 21 different data types (and has some
support for adding your own types). These data types all have an
enumerated type, an enumerated type-character, and a corresponding
array scalar Python type object (placed in a hierarchy). There are
also standard C typedefs to make it easier to manipulate elements of
the given data type. For the numeric types, there are also bit-width
equivalent C typedefs and named typenumbers that make it easier to
select the precision desired.
.. warning::
The names for the types in C code follow C naming conventions
more closely. The Python names for these types follow Python
conventions. Thus, :cdata:`NPY_FLOAT` picks up a 32-bit float in
C, but :class:`numpy.float_` in Python corresponds to a 64-bit
double. The bit-width names can be used in both Python and C for
clarity.
Enumerated Types
----------------
There is a list of enumerated types defined providing the basic 21
data types plus some useful generic names. Whenever the code requires
a type number, one of these enumerated types is requested. The types
are all called :cdata:`NPY_{NAME}` where ``{NAME}`` can be
**BOOL**, **BYTE**, **UBYTE**, **SHORT**, **USHORT**, **INT**,
**UINT**, **LONG**, **ULONG**, **LONGLONG**, **ULONGLONG**,
**FLOAT**, **DOUBLE**, **LONGDOUBLE**, **CFLOAT**, **CDOUBLE**,
**CLONGDOUBLE**, **OBJECT**, **STRING**, **UNICODE**, **VOID**
**NTYPES**, **NOTYPE**, **USERDEF**, **DEFAULT_TYPE**
The various character codes indicating certain types are also part of
an enumerated list. References to type characters (should they be
needed at all) should always use these enumerations. The form of them
is :cdata:`NPY_{NAME}LTR` where ``{NAME}`` can be
**BOOL**, **BYTE**, **UBYTE**, **SHORT**, **USHORT**, **INT**,
**UINT**, **LONG**, **ULONG**, **LONGLONG**, **ULONGLONG**,
**FLOAT**, **DOUBLE**, **LONGDOUBLE**, **CFLOAT**, **CDOUBLE**,
**CLONGDOUBLE**, **OBJECT**, **STRING**, **VOID**
**INTP**, **UINTP**
**GENBOOL**, **SIGNED**, **UNSIGNED**, **FLOATING**, **COMPLEX**
The latter group of ``{NAME}s`` corresponds to letters used in the array
interface typestring specification.
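
The enumerated type number and the character code of a given dtype are
visible from Python, which is handy when debugging extension code:

>>> import numpy as np
>>> d = np.dtype(np.float64)
>>> d.num, d.char                 # NPY_DOUBLE and NPY_DOUBLELTR
(12, 'd')
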
Defines
-------
Max and min values for integers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. cvar:: NPY_MAX_INT{bits}
.. cvar:: NPY_MAX_UINT{bits}
.. cvar:: NPY_MIN_INT{bits}
These are defined for ``{bits}`` = 8, 16, 32, 64, 128, and 256 and provide
the maximum (minimum) value of the corresponding (unsigned) integer
type. Note: the actual integer type may not be available on all
platforms (i.e. 128-bit and 256-bit integers are rare).
.. cvar:: NPY_MIN_{type}
This is defined for ``{type}`` = **BYTE**, **SHORT**, **INT**,
**LONG**, **LONGLONG**, **INTP**
.. cvar:: NPY_MAX_{type}
This is defined for ``{type}`` = **BYTE**, **UBYTE**,
**SHORT**, **USHORT**, **INT**, **UINT**, **LONG**, **ULONG**,
**LONGLONG**, **ULONGLONG**, **INTP**, **UINTP**
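
From Python, the same integer limits are available through
:func:`numpy.iinfo`:

>>> import numpy as np
>>> np.iinfo(np.int16).min, np.iinfo(np.int16).max
(-32768, 32767)
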
Number of bits in data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All :cdata:`NPY_SIZEOF_{CTYPE}` constants have corresponding
:cdata:`NPY_BITSOF_{CTYPE}` constants defined. The :cdata:`NPY_BITSOF_{CTYPE}`
constants provide the number of bits in the data type. Specifically,
the available ``{CTYPE}s`` are
**BOOL**, **CHAR**, **SHORT**, **INT**, **LONG**,
**LONGLONG**, **FLOAT**, **DOUBLE**, **LONGDOUBLE**
Bit-width references to enumerated typenums
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All of the numeric data types (integer, floating point, and complex)
have constants that are defined to be a specific enumerated type
number. Exactly which enumerated type a bit-width type refers to is
platform dependent. In particular, the constants available are
:cdata:`PyArray_{NAME}{BITS}` where ``{NAME}`` is **INT**, **UINT**,
**FLOAT**, **COMPLEX** and ``{BITS}`` can be 8, 16, 32, 64, 80, 96, 128,
160, 192, 256, and 512. Obviously not all bit-widths are available on
all platforms for all the kinds of numeric types. Commonly 8-, 16-,
32-, 64-bit integers; 32-, 64-bit floats; and 64-, 128-bit complex
types are available.
Integer that can hold a pointer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The constants **PyArray_INTP** and **PyArray_UINTP** refer to an
enumerated integer type that is large enough to hold a pointer on the
platform. Index arrays should always be converted to **PyArray_INTP**,
because the dimensions of the array are of type npy_intp.
C-type names
------------
There are standard variable types for each of the numeric data types
and the bool data type. Some of these are already available in the
C-specification. You can create variables in extension code with these
types.
Boolean
^^^^^^^
.. ctype:: npy_bool
unsigned char; The constants :cdata:`NPY_FALSE` and
:cdata:`NPY_TRUE` are also defined.
(Un)Signed Integer
^^^^^^^^^^^^^^^^^^
Unsigned versions of the integers can be defined by prepending a 'u'
to the integer name.
.. ctype:: npy_(u)byte
(unsigned) char
.. ctype:: npy_(u)short
(unsigned) short
.. ctype:: npy_(u)int
(unsigned) int
.. ctype:: npy_(u)long
(unsigned) long int
.. ctype:: npy_(u)longlong
(unsigned) long long int
.. ctype:: npy_(u)intp
(unsigned) Py_intptr_t (an integer that is the size of a pointer on
the platform).
(Complex) Floating point
^^^^^^^^^^^^^^^^^^^^^^^^
.. ctype:: npy_(c)float
float
.. ctype:: npy_(c)double
double
.. ctype:: npy_(c)longdouble
long double
Complex types are structures with **.real** and **.imag** members (in
that order).
Bit-width names
^^^^^^^^^^^^^^^
There are also typedefs for signed integers, unsigned integers,
floating point, and complex floating point types of specific bit-
widths. The available type names are
:ctype:`npy_int{bits}`, :ctype:`npy_uint{bits}`, :ctype:`npy_float{bits}`,
and :ctype:`npy_complex{bits}`
where ``{bits}`` is the number of bits in the type and can be **8**,
**16**, **32**, **64**, 128, and 256 for integer types; 16, **32**
, **64**, 80, 96, 128, and 256 for floating-point types; and 32,
**64**, **128**, 160, 192, and 512 for complex-valued types. Which
bit-widths are available is platform dependent. The bolded bit-widths
are usually available on all platforms.
Printf Formatting
-----------------
For help in printing, the following strings are defined as the correct
format specifier in printf and related commands.
:cdata:`NPY_LONGLONG_FMT`, :cdata:`NPY_ULONGLONG_FMT`,
:cdata:`NPY_INTP_FMT`, :cdata:`NPY_UINTP_FMT`,
:cdata:`NPY_LONGDOUBLE_FMT`

View file

@ -1,175 +0,0 @@
==================================
Generalized Universal Function API
==================================
There is a general need for looping over not only functions on scalars
but also over functions on vectors (or arrays), as explained on
http://scipy.org/scipy/numpy/wiki/GeneralLoopingFunctions. We propose
to realize this concept by generalizing the universal functions
(ufuncs), and provide a C implementation that adds ~500 lines
to the numpy code base. In current (specialized) ufuncs, the elementary
function is limited to element-by-element operations, whereas the
generalized version supports "sub-array" by "sub-array" operations.
The Perl vector library PDL provides a similar functionality and its
terms are re-used in the following.
Each generalized ufunc has information associated with it that states
what the "core" dimensionality of the inputs is, as well as the
corresponding dimensionality of the outputs (the element-wise ufuncs
have zero core dimensions). The list of the core dimensions for all
arguments is called the "signature" of a ufunc. For example, the
ufunc numpy.add has signature ``(),()->()`` defining two scalar inputs
and one scalar output.
Another example is (see the GeneralLoopingFunctions page) the function
``inner1d(a,b)`` with a signature of ``(i),(i)->()``. This applies the
inner product along the last axis of each input, but keeps the
remaining indices intact. For example, where ``a`` is of shape ``(3,5,N)``
and ``b`` is of shape ``(5,N)``, this will return an output of shape ``(3,5)``.
The underlying elementary function is called 3*5 times. In the
signature, we specify one core dimension ``(i)`` for each input and zero core
dimensions ``()`` for the output, since it takes two 1-d arrays and
returns a scalar. By using the same name ``i``, we specify that the two
corresponding dimensions should be of the same size (or one of them is
of size 1 and will be broadcasted).
The dimensions beyond the core dimensions are called "loop" dimensions. In
the above example, this corresponds to ``(3,5)``.
The usual numpy "broadcasting" rules apply, where the signature
determines how the dimensions of each input/output object are split
into core and loop dimensions:
#. While an input array has a smaller dimensionality than the corresponding
number of core dimensions, 1's are prepended to its shape.
#. The core dimensions are removed from all inputs and the remaining
dimensions are broadcast, defining the loop dimensions.
#. The output is given by the loop dimensions plus the output core dimensions.
Definitions
-----------
Elementary Function
Each ufunc consists of an elementary function that performs the
most basic operation on the smallest portion of array arguments
(e.g. adding two numbers is the most basic operation in adding two
arrays). The ufunc applies the elementary function multiple times
on different parts of the arrays. The input/output of elementary
functions can be vectors; e.g., the elementary function of inner1d
takes two vectors as input.
Signature
A signature is a string describing the input/output dimensions of
the elementary function of a ufunc. See section below for more
details.
Core Dimension
The dimensionality of each input/output of an elementary function
is defined by its core dimensions (zero core dimensions correspond
to a scalar input/output). The core dimensions are mapped to the
last dimensions of the input/output arrays.
Dimension Name
A dimension name represents a core dimension in the signature.
Different dimensions may share a name, indicating that they are of
the same size (or are broadcastable).
Dimension Index
A dimension index is an integer representing a dimension name. It
enumerates the dimension names according to the order of the first
occurrence of each name in the signature.
Details of Signature
--------------------
The signature defines "core" dimensionality of input and output
variables, and thereby also defines the contraction of the
dimensions. The signature is represented by a string of the
following format:
* Core dimensions of each input or output array are represented by a
list of dimension names in parentheses, ``(i_1,...,i_N)``; a scalar
input/output is denoted by ``()``. Instead of ``i_1``, ``i_2``,
etc, one can use any valid Python variable name.
* Dimension lists for different arguments are separated by ``","``.
Input/output arguments are separated by ``"->"``.
* If one uses the same dimension name in multiple locations, this
enforces the same size (or broadcastable size) of the corresponding
dimensions.
The formal syntax of signatures is as follows::
<Signature> ::= <Input arguments> "->" <Output arguments>
<Input arguments> ::= <Argument list>
<Output arguments> ::= <Argument list>
<Argument list> ::= nil | <Argument> | <Argument> "," <Argument list>
<Argument> ::= "(" <Core dimension list> ")"
<Core dimension list> ::= nil | <Dimension name> |
<Dimension name> "," <Core dimension list>
<Dimension name> ::= valid Python variable name
Notes:
#. All quotes are for clarity.
#. Core dimensions that share the same name must be broadcastable, as
the two ``i`` in our example above. Each dimension name typically
corresponds to one level of looping in the elementary function's
implementation.
#. Whitespace is ignored.
Here are some examples of signatures:
+-------------+------------------------+-----------------------------------+
| add | ``(),()->()`` | |
+-------------+------------------------+-----------------------------------+
| inner1d | ``(i),(i)->()`` | |
+-------------+------------------------+-----------------------------------+
| sum1d | ``(i)->()`` | |
+-------------+------------------------+-----------------------------------+
| dot2d | ``(m,n),(n,p)->(m,p)`` | matrix multiplication |
+-------------+------------------------+-----------------------------------+
| outer_inner | ``(i,t),(j,t)->(i,j)`` | inner over the last dimension, |
| | | outer over the second to last, |
| | | and loop/broadcast over the rest. |
+-------------+------------------------+-----------------------------------+
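For instance, the ``inner1d`` signature ``(i),(i)->()`` can be sketched
in pure Python (a minimal illustration, not the actual implementation)::

    import numpy as np

    def inner1d(a, b):
        # The core dimension i is contracted away; all leading (loop)
        # dimensions broadcast against each other.
        a, b = np.asarray(a), np.asarray(b)
        return (a * b).sum(axis=-1)

    # Loop dimension 3 broadcasts; core dimension i = 4 is contracted.
    print(inner1d(np.ones((3, 4)), np.arange(4)))   # [6. 6. 6.]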
C-API for implementing Elementary Functions
-------------------------------------------
The current interface remains unchanged, and ``PyUFunc_FromFuncAndData``
can still be used to implement (specialized) ufuncs, consisting of
scalar elementary functions.
One can use ``PyUFunc_FromFuncAndDataAndSignature`` to declare a more
general ufunc. The argument list is the same as
``PyUFunc_FromFuncAndData``, with an additional argument specifying the
signature as C string.
Furthermore, the callback function is of the same type as before,
``void (*foo)(char **args, intp *dimensions, intp *steps, void *func)``.
When invoked, ``args`` is a list of length ``nargs`` containing
the data of all input/output arguments. For a scalar elementary
function, ``steps`` is also of length ``nargs``, denoting the strides used
for the arguments. ``dimensions`` is a pointer to a single integer
defining the size of the axis to be looped over.
For a non-trivial signature, ``dimensions`` will also contain the sizes
of the core dimensions, starting at the second entry. Only
one size is provided for each unique dimension name and the sizes are
given according to the first occurrence of a dimension name in the
signature.
The first ``nargs`` elements of ``steps`` remain the same as for scalar
ufuncs. The following elements contain the strides of all core
dimensions for all arguments in order.
For example, consider a ufunc with signature ``(i,j),(i)->()``. In
this case, ``args`` will contain three pointers to the data of the
input/output arrays ``a``, ``b``, ``c``. Furthermore, ``dimensions`` will be
``[N, I, J]``, defining the size ``N`` of the loop and the sizes ``I`` and ``J``
for the core dimensions ``i`` and ``j``. Finally, ``steps`` will be
``[a_N, b_N, c_N, a_i, a_j, b_i]``, containing all necessary strides.
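As a rough illustration of this layout, the following pure-Python sketch
walks flat buffers the way such a 1-d loop would. It is only a model:
steps are counted in elements rather than bytes, and all names are
illustrative::

    import numpy as np

    def gufunc_loop(args, dimensions, steps):
        # Emulates an (i,j),(i)->() loop: c[n] = sum_ij a[n,i,j] * b[n,i]
        a, b, c = args
        N, I, J = dimensions
        a_N, b_N, c_N, a_i, a_j, b_i = steps
        for n in range(N):
            acc = 0.0
            for i in range(I):
                for j in range(J):
                    acc += a[n*a_N + i*a_i + j*a_j] * b[n*b_N + i*b_i]
            c[n*c_N] = acc

    a = np.arange(24, dtype=float).reshape(2, 3, 4)
    b = np.ones((2, 3))
    c = np.empty(2)
    gufunc_loop((a.ravel(), b.ravel(), c), (2, 3, 4), (12, 3, 1, 4, 1, 1))
    print(np.allclose(c, np.einsum('nij,ni->n', a, b)))   # True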

View file

@ -1,49 +0,0 @@
.. _c-api:
###########
Numpy C-API
###########
.. sectionauthor:: Travis E. Oliphant
| Beware of the man who won't be bothered with details.
| --- *William Feather, Sr.*
| The truth is out there.
| --- *Chris Carter, The X Files*
NumPy provides a C-API to enable users to extend the system and get
access to the array object for use in other routines. The best way to
truly understand the C-API is to read the source code. If you are
unfamiliar with (C) source code, however, this can be a daunting
experience at first. Be assured that the task becomes easier with
practice, and you may be surprised at how simple the C-code can be to
understand. Even if you don't think you can write C-code from scratch,
it is much easier to understand and modify already-written source code
than to create it *de novo*.
Python extensions are especially straightforward to understand because
they all have a very similar structure. Admittedly, NumPy is not a
trivial extension to Python, and may take a little more snooping to
grasp. This is especially true because of the code-generation
techniques, which simplify maintenance of very similar code, but can
make the code a little less readable to beginners. Still, with a
little persistence, the code can be opened to your understanding. It
is my hope that this guide to the C-API can assist in the process of
becoming familiar with the compiled-level work that can be done with
NumPy in order to squeeze that last bit of necessary speed out of your
code.
.. currentmodule:: numpy-c-api
.. toctree::
:maxdepth: 2
c-api.types-and-structures
c-api.config
c-api.dtype
c-api.array
c-api.ufunc
c-api.generalized-ufuncs
c-api.coremath

View file

@ -1,367 +0,0 @@
UFunc API
=========
.. sectionauthor:: Travis E. Oliphant
.. index::
pair: ufunc; C-API
Constants
---------
.. cvar:: UFUNC_ERR_{HANDLER}
``{HANDLER}`` can be **IGNORE**, **WARN**, **RAISE**, or **CALL**
.. cvar:: UFUNC_{THING}_{ERR}
``{THING}`` can be **MASK**, **SHIFT**, or **FPE**, and ``{ERR}`` can
be **DIVIDEBYZERO**, **OVERFLOW**, **UNDERFLOW**, and **INVALID**.
.. cvar:: PyUFunc_{VALUE}
``{VALUE}`` can be **One** (1), **Zero** (0), or **None** (-1)
Macros
------
.. cmacro:: NPY_LOOP_BEGIN_THREADS
Used in universal function code to only release the Python GIL if
loop->obj is not true (*i.e.* this is not an OBJECT array
loop). Requires use of :cmacro:`NPY_BEGIN_THREADS_DEF` in variable
declaration area.
.. cmacro:: NPY_LOOP_END_THREADS
Used in universal function code to re-acquire the Python GIL if it
was released (because loop->obj was not true).
.. cfunction:: UFUNC_CHECK_ERROR(loop)
A macro used internally to check for errors and goto fail if
found. This macro requires a fail label in the current code
block. The *loop* variable must have at least the members *obj*,
*errormask*, and *errorobj*. If *loop* ->obj is nonzero, then
:cfunc:`PyErr_Occurred` () is called (meaning the GIL must be held). If
*loop* ->obj is zero, then if *loop* ->errormask is nonzero,
:cfunc:`PyUFunc_checkfperr` is called with arguments *loop* ->errormask
and *loop* ->errobj. If the result of this check of the IEEE
floating point registers is true then the code redirects to the
fail label which must be defined.
.. cfunction:: UFUNC_CHECK_STATUS(ret)
A macro that expands to platform-dependent code. The *ret*
variable can be any integer. The :cdata:`UFUNC_FPE_{ERR}` bits are
set in *ret* according to the status of the corresponding error
flags of the floating point processor.
Functions
---------
.. cfunction:: PyObject* PyUFunc_FromFuncAndData(PyUFuncGenericFunction* func,
void** data, char* types, int ntypes, int nin, int nout, int identity,
char* name, char* doc, int check_return)
Create a new broadcasting universal function from required variables.
Each ufunc builds around the notion of an element-by-element
operation. Each ufunc object contains pointers to 1-d loops
implementing the basic functionality for each supported type.
.. note::
The *func*, *data*, *types*, *name*, and *doc* arguments are not
copied by :cfunc:`PyUFunc_FromFuncAndData`. The caller must ensure
that the memory used by these arrays is not freed as long as the
ufunc object is alive.
:param func:
Must point to an array of length *ntypes* containing
:ctype:`PyUFuncGenericFunction` items. These items are pointers to
functions that actually implement the underlying
(element-by-element) function :math:`N` times.
:param data:
Should be ``NULL`` or a pointer to an array of size *ntypes*
. This array may contain arbitrary extra-data to be passed to
the corresponding 1-d loop function in the func array.
:param types:
Must be of length (*nin* + *nout*) \* *ntypes*, and it
contains the data-types (built-in only) that the corresponding
function in the *func* array can deal with.
:param ntypes:
How many different data-type "signatures" the ufunc has implemented.
:param nin:
The number of inputs to this operation.
:param nout:
The number of outputs
:param name:
The name for the ufunc. Specifying a name of 'add' or
'multiply' enables a special behavior for integer-typed
reductions when no dtype is given. If the input type is an
integer (or boolean) data type smaller than the size of the int_
data type, it will be internally upcast to the int_ (or uint)
data type.
:param doc:
Allows passing in a documentation string to be stored with the
ufunc. The documentation string should not contain the name
of the function or the calling signature as that will be
dynamically determined from the object and available when
accessing the **__doc__** attribute of the ufunc.
:param check_return:
Unused and present for backwards compatibility of the C-API. A
corresponding *check_return* integer does exist in the ufunc
structure and it does get set with this value when the ufunc
object is created.
.. cfunction:: int PyUFunc_RegisterLoopForType(PyUFuncObject* ufunc,
int usertype, PyUFuncGenericFunction function, int* arg_types, void* data)
This function allows the user to register a 1-d loop with an
already- created ufunc to be used whenever the ufunc is called
with any of its input arguments as the user-defined
data-type. This is needed in order to make ufuncs work with
user-defined data-types. The data-type must have been previously
registered with the numpy system. The loop is passed in as
*function*. This loop can take arbitrary data which should be
passed in as *data*. The data-types the loop requires are passed
in as *arg_types* which must be a pointer to memory at least as
large as ufunc->nargs.
.. cfunction:: int PyUFunc_ReplaceLoopBySignature(PyUFuncObject* ufunc,
PyUFuncGenericFunction newfunc, int* signature,
PyUFuncGenericFunction* oldfunc)
Replace a 1-d loop matching the given *signature* in the
already-created *ufunc* with the new 1-d loop newfunc. Return the
old 1-d loop function in *oldfunc*. Return 0 on success and -1 on
failure. This function works only with built-in types (use
:cfunc:`PyUFunc_RegisterLoopForType` for user-defined types). A
signature is an array of data-type numbers indicating the inputs
followed by the outputs assumed by the 1-d loop.
.. cfunction:: int PyUFunc_GenericFunction(PyUFuncObject* self,
PyObject* args, PyArrayObject** mps)
A generic ufunc call. The ufunc is passed in as *self*, the
arguments to the ufunc as *args*. The *mps* argument is an array
of :ctype:`PyArrayObject` pointers containing the converted input
arguments as well as the ufunc outputs on return. The user is
responsible for managing this array and receives a new reference
for each array in *mps*. The total number of arrays in *mps* is
given by *self* ->nin + *self* ->nout.
.. cfunction:: int PyUFunc_checkfperr(int errmask, PyObject* errobj)
A simple interface to the IEEE error-flag checking support. The
*errmask* argument is a mask of :cdata:`UFUNC_MASK_{ERR}` bitmasks
indicating which errors to check for (and how to check for
them). The *errobj* must be a Python tuple with two elements: a
string containing the name which will be used in any communication
of error and either a callable Python object (call-back function)
or :cdata:`Py_None`. The callable object will only be used if
:cdata:`UFUNC_ERR_CALL` is set as the desired error checking
method. This routine manages the GIL and is safe to call even
after releasing the GIL. If an error in the IEEE-compatible
hardware is detected, -1 is returned; otherwise 0 is
returned.
.. cfunction:: void PyUFunc_clearfperr()
Clear the IEEE error flags.
.. cfunction:: void PyUFunc_GetPyValues(char* name, int* bufsize,
int* errmask, PyObject** errobj)
Get the Python values used for ufunc processing from the
thread-local storage area unless the defaults have been set in
which case the name lookup is bypassed. The name is placed as a
string in the first element of *\*errobj*. The second element is
the looked-up function to call on error callback. The value of the
looked-up buffer-size to use is passed into *bufsize*, and the
value of the error mask is placed into *errmask*.
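These are the same per-thread values exposed at the Python level; a
quick sketch of that Python-side view::

    import numpy as np

    old = np.seterr(divide='warn', over='raise')   # updates the error mask
    print(np.geterr())       # the values looked up during ufunc setup
    print(np.getbufsize())   # the buffer size used for buffered loops
    np.seterr(**old)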
Generic functions
-----------------
At the core of every ufunc is a collection of type-specific functions
that defines the basic functionality for each of the supported types.
These functions must evaluate the underlying function :math:`N\geq1`
times. Extra-data may be passed in that may be used during the
calculation. This feature allows some general functions to be used as
these basic looping functions. The general function has all the code
needed to point variables to the right place and set up a function
call. The general function assumes that the actual function to call is
passed in as the extra data and calls it with the correct values. All
of these functions are suitable for placing directly in the array of
functions stored in the functions member of the PyUFuncObject
structure.
.. cfunction:: void PyUFunc_f_f_As_d_d(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_d_d(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_f_f(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_g_g(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_F_F_As_D_D(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_F_F(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_D_D(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_G_G(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
Type specific, core 1-d functions for ufuncs where each
calculation is obtained by calling a function taking one input
argument and returning one output. This function is passed in
``func``. The letters correspond to the dtype characters of the supported
data types ( ``f`` - float, ``d`` - double, ``g`` - long double,
``F`` - cfloat, ``D`` - cdouble, ``G`` - clongdouble). The
argument *func* must support the same signature. The _As_X_X
variants assume ndarrays of one data type but cast the values to
use an underlying function that takes a different data type. Thus,
:cfunc:`PyUFunc_f_f_As_d_d` uses ndarrays of data type :cdata:`NPY_FLOAT`
but calls out to a C-function that takes double and returns
double.
.. cfunction:: void PyUFunc_ff_f_As_dd_d(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_ff_f(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_dd_d(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_gg_g(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_FF_F_As_DD_D(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_DD_D(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_FF_F(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_GG_G(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
Type specific, core 1-d functions for ufuncs where each
calculation is obtained by calling a function taking two input
arguments and returning one output. The underlying function to
call is passed in as *func*. The letters correspond to
the dtype characters of the specific data type supported by the
general-purpose function. The argument ``func`` must support the
corresponding signature. The ``_As_XX_X`` variants assume ndarrays
of one data type but cast the values at each iteration of the loop
to use the underlying function that takes a different data type.
.. cfunction:: void PyUFunc_O_O(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
.. cfunction:: void PyUFunc_OO_O(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
One-input, one-output, and two-input, one-output core 1-d functions
for the :cdata:`NPY_OBJECT` data type. These functions handle reference
count issues and return early on error. The actual function to call is
*func* and it must accept calls with the signature ``(PyObject*)
(PyObject*)`` for :cfunc:`PyUFunc_O_O` or ``(PyObject*)(PyObject *,
PyObject *)`` for :cfunc:`PyUFunc_OO_O`.
.. cfunction:: void PyUFunc_O_O_method(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
This general purpose 1-d core function assumes that *func* is a string
representing a method of the input object. For each
iteration of the loop, the Python object is extracted from the array
and its *func* method is called returning the result to the output array.
.. cfunction:: void PyUFunc_OO_O_method(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
This general purpose 1-d core function assumes that *func* is a
string representing a method of the input object that takes one
argument. The first argument in *args* is the method whose function is
called, the second argument in *args* is the argument passed to the
function. The output of the function is stored in the third entry
of *args*.
.. cfunction:: void PyUFunc_On_Om(char** args, npy_intp* dimensions,
npy_intp* steps, void* func)
This is the 1-d core function used by the dynamic ufuncs created
by umath.frompyfunc(function, nin, nout). In this case *func* is a
pointer to a :ctype:`PyUFunc_PyFuncData` structure which has the definition
.. ctype:: PyUFunc_PyFuncData
.. code-block:: c
typedef struct {
int nin;
int nout;
PyObject *callable;
} PyUFunc_PyFuncData;
At each iteration of the loop, the *nin* input objects are extracted
from their object arrays and placed into an argument tuple, the Python
*callable* is called with the input arguments, and the nout
outputs are placed into their object arrays.
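At the Python level this machinery is reached through ``frompyfunc``;
for example::

    import numpy as np

    def clipped_add(x, y):
        return min(x + y, 10)

    uadd = np.frompyfunc(clipped_add, 2, 1)   # nin=2, nout=1
    print(uadd(np.arange(8), np.arange(8)))   # [0 2 4 6 8 10 10 10] (object array)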
Importing the API
-----------------
.. cvar:: PY_UFUNC_UNIQUE_SYMBOL
.. cvar:: NO_IMPORT_UFUNC
.. cfunction:: void import_ufunc(void)
These are the constants and functions for accessing the ufunc
C-API from extension modules in precisely the same way as the
array C-API can be accessed. The ``import_ufunc`` () function must
always be called (in the initialization subroutine of the
extension module). If your extension module is in one file then
that is all that is required. The other two constants are useful
if your extension module makes use of multiple files. In that
case, define :cdata:`PY_UFUNC_UNIQUE_SYMBOL` to something unique to
your code and then in source files that do not contain the module
initialization function but still need access to the UFUNC API,
define :cdata:`PY_UFUNC_UNIQUE_SYMBOL` to the same name used previously
and also define :cdata:`NO_IMPORT_UFUNC`.
The C-API is actually an array of function pointers. This array is
created (and pointed to by a global variable) by import_ufunc. The
global variable is either statically defined or allowed to be seen
by other files depending on the state of
:cdata:`PY_UFUNC_UNIQUE_SYMBOL` and :cdata:`NO_IMPORT_UFUNC`.
.. index::
pair: ufunc; C-API

View file

@ -1,316 +0,0 @@
**********************************
Packaging (:mod:`numpy.distutils`)
**********************************
.. module:: numpy.distutils
NumPy provides enhanced distutils functionality to make it easier to
build and install sub-packages, auto-generate code, and extension
modules that use Fortran-compiled libraries. To use features of NumPy
distutils, use the :func:`setup <core.setup>` command from
:mod:`numpy.distutils.core`. A useful :class:`Configuration
<misc_util.Configuration>` class is also provided in
:mod:`numpy.distutils.misc_util` that can make it easier to construct
keyword arguments to pass to the setup function (by passing the
dictionary obtained from the todict() method of the class). More
information is available in the NumPy Distutils Users Guide in
``<site-packages>/numpy/doc/DISTUTILS.txt``.
.. index::
single: distutils
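A minimal ``setup.py`` built on these pieces might look as follows (the
package and file names are purely illustrative)::

    def configuration(parent_package='', top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration('mypkg', parent_package, top_path)
        config.add_extension('spam', sources=['spammodule.c'])
        return config

    if __name__ == '__main__':
        from numpy.distutils.core import setup
        setup(configuration=configuration)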
Modules in :mod:`numpy.distutils`
=================================
misc_util
---------
.. module:: numpy.distutils.misc_util
.. autosummary::
:toctree: generated/
Configuration
get_numpy_include_dirs
dict_append
appendpath
allpath
dot_join
generate_config_py
get_cmd
terminal_has_colors
red_text
green_text
yellow_text
blue_text
cyan_text
cyg2win32
all_strings
has_f_sources
has_cxx_sources
filter_sources
get_dependencies
is_local_src_dir
get_ext_source_files
get_script_files
.. class:: Configuration(package_name=None, parent_name=None, top_path=None, package_path=None, **attrs)
Construct a configuration instance for the given package name. If
*parent_name* is not :const:`None`, then construct the package as a
sub-package of the *parent_name* package. If *top_path* and
*package_path* are :const:`None` then they are assumed equal to
the path of the file this instance was created in. The setup.py
files in the numpy distribution are good examples of how to use
the :class:`Configuration` instance.
.. automethod:: todict
.. automethod:: get_distribution
.. automethod:: get_subpackage
.. automethod:: add_subpackage
.. automethod:: add_data_files
.. automethod:: add_data_dir
.. automethod:: add_include_dirs
.. automethod:: add_headers
.. automethod:: add_extension
.. automethod:: add_library
.. automethod:: add_scripts
.. automethod:: add_installed_library
.. automethod:: add_npy_pkg_config
.. automethod:: paths
.. automethod:: get_config_cmd
.. automethod:: get_build_temp_dir
.. automethod:: have_f77c
.. automethod:: have_f90c
.. automethod:: get_version
.. automethod:: make_svn_version_py
.. automethod:: make_config_py
.. automethod:: get_info
Other modules
-------------
.. currentmodule:: numpy.distutils
.. autosummary::
:toctree: generated/
system_info.get_info
system_info.get_standard_file
cpuinfo.cpu
log.set_verbosity
exec_command
Building Installable C libraries
================================
Conventional C libraries (declared through `add_library`) are not installed; they
are only used during the build (they are statically linked). An installable C
library is a pure C library, which does not depend on the python C runtime, and
is installed such that it may be used by third-party packages. To build and
install the C library, you just use the method `add_installed_library` instead of
`add_library`, which takes the same arguments except for an additional
``install_dir`` argument::
>>> config.add_installed_library('foo', sources=['foo.c'], install_dir='lib')
npy-pkg-config files
--------------------
To make the necessary build options available to third parties, you could use
the `npy-pkg-config` mechanism implemented in `numpy.distutils`. This mechanism is
based on a .ini file which contains all the options. A .ini file is very
similar to .pc files as used by the pkg-config unix utility::
[meta]
Name: foo
Version: 1.0
Description: foo library
[variables]
prefix = /home/user/local
libdir = ${prefix}/lib
includedir = ${prefix}/include
[default]
cflags = -I${includedir}
libs = -L${libdir} -lfoo
Generally, the file needs to be generated during the build, since it needs some
information known at build time only (e.g. prefix). This is mostly automatic if
one uses the `Configuration` method `add_npy_pkg_config`. Assuming we have a
template file foo.ini.in as follows::
[meta]
Name: foo
Version: @version@
Description: foo library
[variables]
prefix = @prefix@
libdir = ${prefix}/lib
includedir = ${prefix}/include
[default]
cflags = -I${includedir}
libs = -L${libdir} -lfoo
and the following code in setup.py::
>>> config.add_installed_library('foo', sources=['foo.c'], install_dir='lib')
>>> subst = {'version': '1.0'}
>>> config.add_npy_pkg_config('foo.ini.in', 'lib', subst_dict=subst)
This will install the file foo.ini into the directory package_dir/lib, and the
foo.ini file will be generated from foo.ini.in, where each ``@version@`` will be
replaced by ``subst_dict['version']``. The dictionary has an additional prefix
substitution rule automatically added, which contains the install prefix (since
this is not easy to get from setup.py). npy-pkg-config files can also be
installed at the same location as used for numpy, using the path returned by the
`get_npy_pkg_dir` function.
Reusing a C library from another package
----------------------------------------
Build information is easily retrieved with the `get_info` function in
`numpy.distutils.misc_util`::
>>> info = get_info('npymath')
>>> config.add_extension('foo', sources=['foo.c'], extra_info=info)
An additional list of paths to look for .ini files can be given to `get_info`.
Conversion of ``.src`` files
============================
NumPy distutils supports automatic conversion of source files named
<somefile>.src. This facility can be used to maintain very similar
code blocks requiring only simple changes between blocks. During the
build phase of setup, if a template file named <somefile>.src is
encountered, a new file named <somefile> is constructed from the
template and placed in the build directory to be used instead. Two
forms of template conversion are supported. The first form occurs for
files named <file>.ext.src where ext is a recognized Fortran
extension (f, f90, f95, f77, for, ftn, pyf). The second form is used
for all other cases.
.. index::
single: code generation
Fortran files
-------------
This template converter will replicate all **function** and
**subroutine** blocks in the file with names that contain '<...>'
according to the rules in '<...>'. The number of comma-separated words
in '<...>' determines the number of times the block is repeated. The
words themselves indicate what the repeat rule, '<...>', should be
replaced with in each block. All of the repeat rules in a block must
contain the same number of comma-separated words indicating the number
of times that block should be repeated. If the word in the repeat rule
needs a comma, leftarrow, or rightarrow, then prepend it with a
backslash ' \'. If a word in the repeat rule matches ' \\<index>' then
it will be replaced with the <index>-th word in the same repeat
specification. There are two forms for the repeat rule: named and
short.
Named repeat rule
^^^^^^^^^^^^^^^^^
A named repeat rule is useful when the same set of repeats must be
used several times in a block. It is specified using <rule1=item1,
item2, item3,..., itemN>, where N is the number of times the block
should be repeated. On each repeat of the block, the entire
expression, '<...>' will be replaced first with item1, and then with
item2, and so forth until N repeats are accomplished. Once a named
repeat specification has been introduced, the same repeat rule may be
used **in the current block** by referring only to the name
(i.e. <rule1>).
Short repeat rule
^^^^^^^^^^^^^^^^^
A short repeat rule looks like <item1, item2, item3, ..., itemN>. The
rule specifies that the entire expression, '<...>' should be replaced
first with item1, and then with item2, and so forth until N repeats
are accomplished.
Pre-defined names
^^^^^^^^^^^^^^^^^
The following predefined named repeat rules are available:
- <prefix=s,d,c,z>
- <_c=s,d,c,z>
- <_t=real, double precision, complex, double complex>
- <ftype=real, double precision, complex, double complex>
- <ctype=float, double, complex_float, complex_double>
- <ftypereal=float, double precision, \\0, \\1>
- <ctypereal=float, double, \\0, \\1>
Other files
-----------
Non-Fortran files use a separate syntax for defining template blocks
that should be repeated using a variable expansion similar to the
named repeat rules of the Fortran-specific repeats. The template rules
for these files are:
1. "/\**begin repeat "on a line by itself marks the beginning of
a segment that should be repeated.
2. Named variable expansions are defined using #name=item1, item2, item3,
..., itemN# and placed on successive lines. These variables are
replaced in each repeat block with corresponding word. All named
variables in the same repeat block must define the same number of
words.
3. In specifying the repeat rule for a named variable, item*N is
shorthand for item, item, ..., item repeated N times. In addition,
parentheses in combination with \*N can be used for grouping several
items that should be repeated. Thus, #name=(item1, item2)*4# is
equivalent to #name=item1, item2, item1, item2, item1, item2, item1,
item2#
4. "\*/ "on a line by itself marks the end of the the variable expansion
naming. The next line is the first line that will be repeated using
the named rules.
5. Inside the block to be repeated, the variables that should be expanded
are specified as @name@.
6. "/\**end repeat**/ "on a line by itself marks the previous line
as the last line of the block to be repeated.

Binary file not shown.


View file

@ -1,57 +0,0 @@
[xfig drawing source omitted: a figure relating the ndarray (a header block plus a strided data buffer), its data-type object, and the array scalar; text labels in the drawing: "header", "ndarray", "data-type", "head", "array", "scalar".]

Binary file not shown.


View file

@ -1,42 +0,0 @@
.. _reference:
###############
NumPy Reference
###############
:Release: |version|
:Date: |today|
.. module:: numpy
This reference manual details functions, modules, and objects
included in Numpy, describing what they are and what they do.
For learning how to use NumPy, see also :ref:`user`.
.. toctree::
:maxdepth: 2
arrays
ufuncs
routines
ctypes
distutils
c-api
internals
Acknowledgements
================
Large parts of this manual originate from Travis E. Oliphant's book
`Guide to Numpy <http://www.tramy.us/>`__ (which generously entered
Public Domain in August 2008). The reference documentation for many of
the functions was written by numerous contributors and developers of
Numpy, both prior to and during the
`Numpy Documentation Marathon
<http://scipy.org/Developer_Zone/DocMarathon2008>`__.
Please help to improve NumPy's documentation! Instructions on how to
join the ongoing documentation marathon can be found
`on the scipy.org website <http://scipy.org/Developer_Zone/DocMarathon2008>`__.

View file

@ -1,666 +0,0 @@
.. currentmodule:: numpy
*************************
Numpy C Code Explanations
*************************
Fanaticism consists of redoubling your efforts when you have forgotten
your aim.
--- *George Santayana*
An authority is a person who can tell you more about something than
you really care to know.
--- *Unknown*
This Chapter attempts to explain the logic behind some of the new
pieces of code. The purpose behind these explanations is to enable
somebody to understand the ideas behind the implementation
somewhat more easily than just staring at the code. Perhaps in this
way, the algorithms can be improved on, borrowed from, and/or
optimized.
Memory model
============
.. index::
pair: ndarray; memory model
One fundamental aspect of the ndarray is that an array is seen as a
"chunk" of memory starting at some location. The interpretation of
this memory depends on the stride information. For each dimension in
an :math:`N` -dimensional array, an integer (stride) dictates how many
bytes must be skipped to get to the next element in that dimension.
Unless you have a single-segment array, this stride information must
be consulted when traversing through an array. It is not difficult to
write code that accepts strides; you just have to use (char \*)
pointers because strides are in units of bytes. Keep in mind also that
strides do not have to be unit-multiples of the element size. Also,
remember that if the number of dimensions of the array is 0 (sometimes
called a rank-0 array), then the strides and dimensions variables are
NULL.
Besides the structural information contained in the strides and
dimensions members of the :ctype:`PyArrayObject`, the flags contain important
information about how the data may be accessed. In particular, the
:cdata:`NPY_ALIGNED` flag is set when the memory is on a suitable boundary
according to the data-type array. Even if you have a contiguous chunk
of memory, you cannot just assume it is safe to dereference a data-
type-specific pointer to an element. Only if the :cdata:`NPY_ALIGNED` flag is
set is this a safe operation (on some platforms it will work but on
others, like Solaris, it will cause a bus error). The :cdata:`NPY_WRITEABLE`
should also be ensured if you plan on writing to the memory area of
the array. It is also possible to obtain a pointer to an unwriteable
memory area. Sometimes, writing to the memory area when the
:cdata:`NPY_WRITEABLE` flag is not set will just be rude. Other times it can
cause program crashes ( *e.g.* a data-area that is a read-only
memory-mapped file).
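The same structural information is visible from Python, which can help
when reasoning about the C level::

    import numpy as np

    a = np.arange(12, dtype=np.int32).reshape(3, 4)
    print(a.strides)                 # (16, 4): bytes to skip per dimension
    b = a[:, ::2]                    # a non-contiguous view of the same chunk
    print(b.strides)                 # (16, 8): strides need not match itemsize
    print(b.flags['C_CONTIGUOUS'])   # False
    print(a.flags['ALIGNED'], a.flags['WRITEABLE'])   # True True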
Data-type encapsulation
=======================
.. index::
single: dtype
The data-type is an important abstraction of the ndarray. Operations
will look to the data-type to provide the key functionality that is
needed to operate on the array. This functionality is provided in the
list of function pointers pointed to by the 'f' member of the
:ctype:`PyArray_Descr` structure. In this way, the number of data-types can be
extended simply by providing a :ctype:`PyArray_Descr` structure with suitable
function pointers in the 'f' member. For built-in types there are some
optimizations that by-pass this mechanism, but the point of the data-
type abstraction is to allow new data-types to be added.
One of the built-in data-types, the void data-type allows for
arbitrary records containing 1 or more fields as elements of the
array. A field is simply another data-type object along with an offset
into the current record. In order to support arbitrarily nested
fields, several recursive implementations of data-type access are
implemented for the void type. A common idiom is to cycle through the
elements of the dictionary and perform a specific operation based on
the data-type object stored at the given offset. These offsets can be
arbitrary numbers. Therefore, the possibility of encountering mis-
aligned data must be recognized and taken into account if necessary.
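The field/offset layout described here can be inspected from Python::

    import numpy as np

    dt = np.dtype([('x', np.int32), ('y', np.float64)])
    print(dict(dt.fields))   # {'x': (dtype('int32'), 0), 'y': (dtype('float64'), 4)}
    print(dt.itemsize)       # 12: the float64 field sits at a misaligned offset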
N-D Iterators
=============
.. index::
single: array iterator
A very common operation in much of NumPy code is the need to iterate
over all the elements of a general, strided, N-dimensional array. This
operation of a general-purpose N-dimensional loop is abstracted in the
notion of an iterator object. To write an N-dimensional loop, you only
have to create an iterator object from an ndarray, work with the
dataptr member of the iterator object structure and call the macro
:cfunc:`PyArray_ITER_NEXT` (it) on the iterator object to move to the next
element. The "next" element is always in C-contiguous order. The macro
works by first special casing the C-contiguous, 1-D, and 2-D cases
which work very simply.
For the general case, the iteration works by keeping track of a list
of coordinate counters in the iterator object. At each iteration, the
last coordinate counter is increased (starting from 0). If this
counter is smaller than one less than the size of the array in that
dimension (a pre-computed and stored value), then the counter is
increased and the dataptr member is increased by the strides in that
dimension and the macro ends. If the end of a dimension is reached,
the counter for the last dimension is reset to zero and the dataptr is
moved back to the beginning of that dimension by subtracting the
strides value times one less than the number of elements in that
dimension (this is also pre-computed and stored in the backstrides
member of the iterator object). In this case, the macro does not end,
but a local dimension counter is decremented so that the next-to-last
dimension replaces the role that the last dimension played and the
previously-described tests are executed again on the next-to-last
dimension. In this way, the dataptr is adjusted appropriately for
arbitrary striding.
The coordinates member of the :ctype:`PyArrayIterObject` structure maintains
the current N-d counter unless the underlying array is C-contiguous in
which case the coordinate counting is by-passed. The index member of
the :ctype:`PyArrayIterObject` keeps track of the current flat index of the
iterator. It is updated by the :cfunc:`PyArray_ITER_NEXT` macro.
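From Python, ``np.nditer`` exposes a comparable element-by-element walk
over a strided view::

    import numpy as np

    a = np.arange(6).reshape(2, 3)[:, ::-1]   # strided, non-contiguous view
    print(a.tolist())                         # [[2, 1, 0], [5, 4, 3]]
    print([int(x) for x in np.nditer(a, order='C')])   # [2, 1, 0, 5, 4, 3]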
Broadcasting
============
.. index::
single: broadcasting
In Numeric, broadcasting was implemented in several lines of code
buried deep in ufuncobject.c. In NumPy, the notion of broadcasting has
been abstracted so that it can be performed in multiple places.
Broadcasting is handled by the function :cfunc:`PyArray_Broadcast`. This
function requires a :ctype:`PyArrayMultiIterObject` (or something that is a
binary equivalent) to be passed in. The :ctype:`PyArrayMultiIterObject` keeps
track of the broadcasted number of dimensions and size in each
dimension along with the total size of the broadcasted result. It also
keeps track of the number of arrays being broadcast and a pointer to
an iterator for each of the arrays being broadcasted.
The :cfunc:`PyArray_Broadcast` function takes the iterators that have already
been defined and uses them to determine the broadcast shape in each
dimension (to create the iterators at the same time that broadcasting
occurs then use the :cfunc:`PyMultiIter_New` function). Then, the iterators are
adjusted so that each iterator thinks it is iterating over an array
with the broadcasted size. This is done by adjusting the iterators
number of dimensions, and the shape in each dimension. This works
because the iterator strides are also adjusted. Broadcasting only
adjusts (or adds) length-1 dimensions. For these dimensions, the
strides variable is simply set to 0 so that the data-pointer for the
iterator over that array doesn't move as the broadcasting operation
operates over the extended dimension.
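The zero-stride trick is observable from Python through
``np.broadcast_arrays``::

    import numpy as np

    a = np.arange(3.).reshape(3, 1)
    b = np.arange(4.)
    ab, bb = np.broadcast_arrays(a, b)
    print(ab.shape, ab.strides)   # (3, 4) (8, 0): stride 0 on the stretched axis
    print(bb.shape, bb.strides)   # (3, 4) (0, 8)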
Broadcasting was always implemented in Numeric using 0-valued strides
for the extended dimensions. It is done in exactly the same way in
NumPy. The big difference is that now the array of strides is kept
track of in a :ctype:`PyArrayIterObject`, the iterators involved in a
broadcasted result are kept track of in a :ctype:`PyArrayMultiIterObject`,
and the :cfunc:`PyArray_Broadcast` call implements the broadcasting rules.
Array Scalars
=============
.. index::
single: array scalars
The array scalars offer a hierarchy of Python types that allow a one-
to-one correspondence between the data-type stored in an array and the
Python-type that is returned when an element is extracted from the
array. An exception to this rule was made with object arrays. Object
arrays are heterogeneous collections of arbitrary Python objects. When
you select an item from an object array, you get back the original
Python object (and not an object array scalar which does exist but is
rarely used for practical purposes).
The array scalars also offer the same methods and attributes as arrays
with the intent that the same code can be used to support arbitrary
dimensions (including 0-dimensions). The array scalars are read-only
(immutable) with the exception of the void scalar which can also be
written to so that record-array field setting works more naturally
(a[0]['f1'] = ``value`` ).
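A short illustration of both the array-like scalar interface and the
writable void scalar::

    import numpy as np

    s = np.float64(3.0)
    print(s.ndim, s.shape, s.dtype)   # 0 () float64: array-like interface

    r = np.zeros(2, dtype=[('f1', 'i4')])
    v = r[0]                          # a void scalar: the mutable exception
    v['f1'] = 7
    print(r)                          # [(7,) (0,)]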
Advanced ("Fancy") Indexing
=============================
.. index::
single: indexing
The implementation of advanced indexing represents some of the most
difficult code to write and explain. In fact, there are two
implementations of advanced indexing. The first works only with 1-D
arrays and is implemented to handle expressions involving a.flat[obj].
The second is a general-purpose implementation that works for arrays of "arbitrary
dimension" (up to a fixed maximum). The one-dimensional indexing
approaches were implemented in a rather straightforward fashion, and
so it is the general-purpose indexing code that will be the focus of
this section.
There is a multi-layer approach to indexing because the indexing code
can at times return an array scalar and at other times return an
array. The functions with "_nice" appended to their name do this
special handling while the functions without the _nice appendage always
return an array (perhaps a 0-dimensional array). Some special-case
optimizations (the index being an integer scalar, and the index being
a tuple with as many dimensions as the array) are handled in
the array_subscript_nice function, which is what Python calls when
presented with the code "a[obj]." These optimizations allow fast
single-integer indexing, and also ensure that a 0-dimensional array is
not created only to be discarded as the array scalar is returned
instead. This provides significant speed-up for code that is selecting
many scalars out of an array (such as in a loop). However, it is still
not faster than simply using a list to store standard Python scalars,
because that is optimized by the Python interpreter itself.
After these optimizations, the array_subscript function itself is
called. This function first checks for field selection which occurs
when a string is passed as the indexing object. Then, 0-D arrays are
given special-case consideration. Finally, the code determines whether
or not advanced, or fancy, indexing needs to be performed. If fancy
indexing is not needed, then standard view-based indexing is performed
using code borrowed from Numeric which parses the indexing object and
returns the offset into the data-buffer and the dimensions necessary
to create a new view of the array. The strides are also changed by
multiplying each stride by the step-size requested along the
corresponding dimension.
Fancy-indexing check
--------------------
The fancy_indexing_check routine determines whether or not to use
standard view-based indexing or new copy-based indexing. If the
indexing object is a tuple, then view-based indexing is assumed by
default. Only if the tuple contains an array object or a sequence
object is fancy-indexing assumed. If the indexing object is an array,
then fancy indexing is automatically assumed. If the indexing object
is any other kind of sequence, then fancy-indexing is assumed by
default. This is overridden to simple indexing if the sequence
contains any slice, newaxis, or Ellipsis objects, and no arrays or
additional sequences are also contained in the sequence. The purpose
of this is to allow the construction of "slicing" sequences which is a
common technique for building up code that works in arbitrary numbers
of dimensions.
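The distinction can be seen directly from Python::

    import numpy as np

    a = np.arange(12).reshape(3, 4)
    print(a[(1, 2)])     # a tuple: view-based indexing, same as a[1, 2]
    print(a[[1, 2]])     # a list: fancy indexing, copies rows 1 and 2
    print(a[:, [0, 2]])  # an array inside a tuple still triggers fancy indexing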
Fancy-indexing implementation
-----------------------------
The concept of indexing was also abstracted using the idea of an
iterator. If fancy indexing is performed, then a :ctype:`PyArrayMapIterObject`
is created. This internal object is not exposed to Python. It is
created in order to handle the fancy-indexing at a high-level. Both
get and set fancy-indexing operations are implemented using this
object. Fancy indexing is abstracted into three separate operations:
(1) creating the :ctype:`PyArrayMapIterObject` from the indexing object, (2)
binding the :ctype:`PyArrayMapIterObject` to the array being indexed, and (3)
getting (or setting) the items determined by the indexing object.
There is an optimization implemented so that the :ctype:`PyArrayIterObject`
(which has its own less complicated fancy-indexing) is used for
indexing when possible.
Creating the mapping object
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The first step is to convert the indexing objects into a standard form
where iterators are created for all of the index array inputs and all
Boolean arrays are converted to equivalent integer index arrays (as if
nonzero(arr) had been called). Finally, all integer arrays are
replaced with the integer 0 in the indexing object and all of the
index-array iterators are "broadcast" to the same shape.
Binding the mapping object
^^^^^^^^^^^^^^^^^^^^^^^^^^
When the mapping object is created it does not know which array it
will be used with so once the index iterators are constructed during
mapping-object creation, the next step is to associate these iterators
with a particular ndarray. This process interprets any ellipsis and
slice objects so that the index arrays are associated with the
appropriate axis (the axis indicated by the iteraxis entry
corresponding to the iterator for the integer index array). This
information is then used to check the indices to be sure they are
within range of the shape of the array being indexed. The presence of
ellipsis and/or slice objects implies a sub-space iteration that is
accomplished by extracting a sub-space view of the array (using the
index object resulting from replacing all the integer index arrays
with 0) and storing the information about where this sub-space starts
in the mapping object. This is used later during mapping-object
iteration to select the correct elements from the underlying array.
Getting (or Setting)
^^^^^^^^^^^^^^^^^^^^
After the mapping object is successfully bound to a particular array,
the mapping object contains the shape of the resulting item as well as
iterator objects that will walk through the currently-bound array and
either get or set its elements as needed. The walk is implemented
using the :cfunc:`PyArray_MapIterNext` function. This function sets the
coordinates of an iterator object into the current array to be the
next coordinate location indicated by all of the indexing-object
iterators while adjusting, if necessary, for the presence of a sub-
space. The result of this function is that the dataptr member of the
mapping object structure is pointed to the next position in the array
that needs to be copied out or set to some value.
When advanced indexing is used to extract an array, an iterator for
the new array is constructed and advanced in phase with the mapping
object iterator. When advanced indexing is used to place values in an
array, a special "broadcasted" iterator is constructed from the object
being placed into the array so that it will only work if the values
used for setting have a shape that is "broadcastable" to the shape
implied by the indexing object.
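For example, setting through a fancy index broadcasts the value to the
shape implied by the indexing object::

    import numpy as np

    a = np.zeros((3, 4))
    a[[0, 2]] = np.arange(4.)   # the (4,) value broadcasts to the (2, 4) selection
    print(a)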
Universal Functions
===================
.. index::
single: ufunc
Universal functions are callable objects that take :math:`N` inputs
and produce :math:`M` outputs by wrapping basic 1-D loops that work
element-by-element into full easy-to use functions that seamlessly
implement broadcasting, type-checking and buffered coercion, and
output-argument handling. New universal functions are normally created
in C, although there is a mechanism for creating ufuncs from Python
functions (:func:`frompyfunc`). The user must supply a 1-D loop that
implements the basic function taking the input scalar values and
placing the resulting scalars into the appropriate output slots as
explained in the implementation below.
Setup
-----
Every ufunc calculation involves some overhead related to setting up
the calculation. The practical significance of this overhead is that
even though the actual calculation of the ufunc is very fast, you will
be able to write array and type-specific code that will work faster
for small arrays than the ufunc. In particular, using ufuncs to
perform many calculations on 0-D arrays will be slower than other
Python-based solutions (the silently-imported scalarmath module exists
precisely to give array scalars the look-and-feel of ufunc-based
calculations with significantly reduced overhead).
When a ufunc is called, many things must be done. The information
collected from these setup operations is stored in a loop-object. This
loop object is a C-structure (that could become a Python object but is
not initialized as such because it is only used internally). This loop
object has the layout needed to be used with PyArray_Broadcast so that
the broadcasting can be handled in the same way as it is handled in
other sections of code.
The first thing done is to look-up in the thread-specific global
dictionary the current values for the buffer-size, the error mask, and
the associated error object. The state of the error mask controls what
happens when an error condition is found. It should be noted that
checking of the hardware error flags is only performed after each 1-D
loop is executed. This means that if the input and output arrays are
contiguous and of the correct type so that a single 1-D loop is
performed, then the flags may not be checked until all elements of the
array have been calculated. Looking up these values in a thread-
specific dictionary takes time which is easily ignored for all but
very small arrays.
After checking the thread-specific global variables, the inputs are
evaluated to determine how the ufunc should proceed and the input and
output arrays are constructed if necessary. Any inputs which are not
arrays are converted to arrays (using context if necessary). Which of
the inputs are scalars (and therefore converted to 0-D arrays) is
noted.
Next, an appropriate 1-D loop is selected from the 1-D loops available
to the ufunc based on the input array types. This 1-D loop is selected
by trying to match the signature of the data-types of the inputs
against the available signatures. The signatures corresponding to
built-in types are stored in the types member of the ufunc structure.
The signatures corresponding to user-defined types are stored in a
linked-list of function-information with the head element stored as a
``CObject`` in the userloops dictionary keyed by the data-type number
(the first user-defined type in the argument list is used as the key).
The signatures are searched until a signature is found to which the
input arrays can all be cast safely (ignoring any scalar arguments
which are not allowed to determine the type of the result). The
implication of this search procedure is that "lesser types" should be
placed below "larger types" when the signatures are stored. If no 1-D
loop is found, then an error is reported. Otherwise, the argument_list
is updated with the stored signature --- in case casting is necessary
and to fix the output types assumed by the 1-D loop.
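The built-in signatures searched during this step are visible from
Python as the ``types`` attribute of a ufunc::

    import numpy as np

    print(np.add.types[:4])   # e.g. ['??->?', 'bb->b', 'BB->B', 'hh->h']
    # The first signature to which all inputs can be safely cast is used:
    print(np.add(np.int8(1), np.float32(2)).dtype)   # float32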
If the ufunc has 2 inputs and 1 output and the second input is an
Object array then a special-case check is performed so that
NotImplemented is returned if the second input is not an ndarray, has
the __array_priority\__ attribute, and has an __r{op}\__ special
method. In this way, Python is signaled to give the other object a
chance to complete the operation instead of using generic object-array
calculations. This allows (for example) sparse matrices to override
the multiplication operator 1-D loop.
For input arrays that are smaller than the specified buffer size,
copies are made of all non-contiguous, mis-aligned, or out-of-
byteorder arrays to ensure that for small arrays, a single-loop is
used. Then, array iterators are created for all the input arrays and
the resulting collection of iterators is broadcast to a single shape.
The output arguments (if any) are then processed and any missing
return arrays are constructed. If any provided output array doesn't
have the correct type (or is mis-aligned) and is smaller than the
buffer size, then a new output array is constructed with the special
UPDATEIFCOPY flag set so that when it is DECREF'd on completion of the
function, its contents will be copied back into the output array.
Iterators for the output arguments are then processed.
Finally, the decision is made about how to execute the looping
mechanism to ensure that all elements of the input arrays are combined
to produce the output arrays of the correct type. The options for loop
execution are one-loop (for contiguous, aligned, and correct data-
type), strided-loop (for non-contiguous but still aligned and correct
data-type), and a buffered loop (for mis-aligned or incorrect data-
type situations). Depending on which execution method is called for,
the loop is then setup and computed.
Function call
-------------
This section describes how the basic universal function computation
loop is setup and executed for each of the three different kinds of
execution possibilities. If :cdata:`NPY_ALLOW_THREADS` is defined during
compilation, then the Python Global Interpreter Lock (GIL) is released
prior to calling all of these loops (as long as they don't involve
object arrays). It is re-acquired if necessary to handle error
conditions. The hardware error flags are checked only after the 1-D
loop is calculated.
One Loop
^^^^^^^^
This is the simplest case of all. The ufunc is executed by calling the
underlying 1-D loop exactly once. This is possible only when we have
aligned data of the correct type (including byte-order) for both input
and output and all arrays have uniform strides (either contiguous,
0-D, or 1-D). In this case, the 1-D computational loop is called once
to compute the calculation for the entire array. Note that the
hardware error flags are only checked after the entire calculation is
complete.
Strided Loop
^^^^^^^^^^^^
When the input and output arrays are aligned and of the correct type,
but the striding is not uniform (non-contiguous and 2-D or larger),
then a second looping structure is employed for the calculation. This
approach converts all of the iterators for the input and output
arguments to iterate over all but the largest dimension. The inner
loop is then handled by the underlying 1-D computational loop. The
outer loop is a standard iterator loop on the converted iterators. The
hardware error flags are checked after each 1-D loop is completed.
Buffered Loop
^^^^^^^^^^^^^
This is the code that handles the situation whenever the input and/or
output arrays are either misaligned or of the wrong data-type
(including being byte-swapped) from what the underlying 1-D loop
expects. The arrays are also assumed to be non-contiguous. The code
works very much like the strided loop except that the inner 1-D loop is
modified so that pre-processing is performed on the inputs and post-
processing is performed on the outputs in bufsize chunks (where
bufsize is a user-settable parameter). The underlying 1-D
computational loop is called on data that is copied over (if it needs
to be). The setup code and the loop code are considerably more
complicated in this case because it has to handle:
- memory allocation of the temporary buffers
- deciding whether or not to use buffers on the input and output data
(mis-aligned and/or wrong data-type)
- copying and possibly casting data for any inputs or outputs for which
buffers are necessary.
- special-casing Object arrays so that reference counts are properly
handled when copies and/or casts are necessary.
- breaking up the inner 1-D loop into bufsize chunks (with a possible
remainder).
Again, the hardware error flags are checked at the end of each 1-D
loop.
Final output manipulation
-------------------------
Ufuncs allow other array-like classes to be passed seamlessly through
the interface in that inputs of a particular class will induce the
outputs to be of that same class. The mechanism by which this works is
the following. If any of the inputs are not ndarrays and define the
:obj:`__array_wrap__` method, then the class with the largest
:obj:`__array_priority__` attribute determines the type of all the
outputs (with the exception of any output arrays passed in). The
:obj:`__array_wrap__` method of the input array will be called with the
ndarray being returned from the ufunc as its input. There are two
calling styles of the :obj:`__array_wrap__` function supported. The first
takes the ndarray as the first argument and a tuple of "context" as
the second argument. The context is (ufunc, arguments, output argument
number). This is the first call tried. If a TypeError occurs, then the
function is called with just the ndarray as the first argument.
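A minimal sketch of this hand-off, written against the
two-argument-plus-context convention described here (the class name is
illustrative)::

    import numpy as np

    class Tagged(np.ndarray):
        __array_priority__ = 10.0        # wins the output-type competition
        def __array_wrap__(self, out_arr, context=None):
            # context is (ufunc, arguments, output number) on the first try
            return np.ndarray.__array_wrap__(self, out_arr, context)

    t = np.arange(4).view(Tagged)
    print(type(np.add(t, 1)).__name__)   # Tagged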
Methods
-------
There are three methods of ufuncs that require calculation similar to
the general-purpose ufuncs. These are reduce, accumulate, and
reduceat. Each of these methods requires a setup command followed by a
loop. There are four loop styles possible for the methods
corresponding to no-elements, one-element, strided-loop, and buffered-
loop. These are the same basic loop styles as implemented for the
general purpose function call except for the no-element and one-
element cases which are special-cases occurring when the input array
objects have 0 and 1 elements respectively.
Setup
^^^^^
The setup function for all three methods is ``construct_reduce``.
This function creates a reducing loop object and fills it with
parameters needed to complete the loop. All of the methods only work
on ufuncs that take 2 inputs and return 1 output. Therefore, the
underlying 1-D loop is selected assuming a signature of [ ``otype``,
``otype``, ``otype`` ] where ``otype`` is the requested reduction
data-type. The buffer size and error handling is then retrieved from
(per-thread) global storage. For small arrays that are mis-aligned or
have incorrect data-type, a copy is made so that the un-buffered
section of code is used. Then, the looping strategy is selected. If
there is 1 element or 0 elements in the array, then a simple looping
method is selected. If the array is not mis-aligned and has the
correct data-type, then strided looping is selected. Otherwise,
buffered looping must be performed. Looping parameters are then
established, and the return array is constructed. The output array is
of a different shape depending on whether the method is reduce,
accumulate, or reduceat. If an output array is already provided, then
its shape is checked. If the output array is not C-contiguous,
aligned, and of the correct data type, then a temporary copy is made
with the UPDATEIFCOPY flag set. In this way, the methods will be able
to work with a well-behaved output array but the result will be copied
back into the true output array when the method computation is
complete. Finally, iterators are set up to loop over the correct axis
(depending on the value of axis provided to the method) and the setup
routine returns to the actual computation routine.
Reduce
^^^^^^
.. index::
triple: ufunc; methods; reduce
All of the ufunc methods use the same underlying 1-D computational
loops with input and output arguments adjusted so that the appropriate
reduction takes place. For example, the key to the functioning of
reduce is that the 1-D loop is called with the output and the second
input pointing to the same position in memory and both having a step-
size of 0. The first input is pointing to the input array with a step-
size given by the appropriate stride for the selected axis. In this
way, the operation performed is
.. math::
:nowrap:
\begin{align*}
o & = & i[0] \\
o & = & i[k]\textrm{<op>}o\quad k=1\ldots N
\end{align*}
where :math:`N+1` is the number of elements in the input, :math:`i`,
:math:`o` is the output, and :math:`i[k]` is the
:math:`k^{\textrm{th}}` element of :math:`i` along the selected axis.
This basic operation is repeated for arrays with greater than 1
dimension so that the reduction takes place for every 1-D sub-array
along the selected axis. An iterator with the selected dimension
removed handles this looping.
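These semantics can be modeled in pure Python (``model_reduce`` is an
invented name; the real loop operates on raw memory with strides)::

    import numpy as np

    def model_reduce(op, i):
        # o and the "second input" share the same memory (the
        # accumulator) with a step-size of 0, while the first input
        # strides through the array.
        o = i[0]                      # o = i[0]
        for k in range(1, len(i)):
            o = op(i[k], o)           # o = i[k] <op> o
        return o

    a = np.arange(1, 6)
    assert model_reduce(lambda x, y: x + y, a) == np.add.reduce(a)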
For buffered loops, care must be taken to copy and cast data before
the loop function is called because the underlying loop expects
aligned data of the correct data-type (including byte-order). The
buffered loop must handle this copying and casting prior to calling
the loop function on chunks no greater than the user-specified
bufsize.
Accumulate
^^^^^^^^^^
.. index::
triple: ufunc; methods; accumulate
The accumulate function is very similar to the reduce function in that
the output and the second input both point to the output. The
difference is that the second input points to memory one stride behind
the current output pointer. Thus, the operation performed is
.. math::
:nowrap:
\begin{align*}
o[0] & = & i[0] \\
o[k] & = & i[k]\textrm{<op>}o[k-1]\quad k=1\ldots N.
\end{align*}
The output has the same shape as the input and each 1-D loop operates
over :math:`N` elements when the shape in the selected axis is :math:`N+1`.
Again, buffered loops take care to copy and cast the data before
calling the underlying 1-D computational loop.
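A corresponding pure-Python model of these semantics (again,
``model_accumulate`` is an invented illustration) is::

    import numpy as np

    def model_accumulate(op, i):
        # The second input trails the output by one stride.
        o = np.empty_like(i)
        o[0] = i[0]                        # o[0] = i[0]
        for k in range(1, len(i)):
            o[k] = op(i[k], o[k - 1])      # o[k] = i[k] <op> o[k-1]
        return o

    a = np.arange(1, 6)
    assert np.array_equal(model_accumulate(lambda x, y: x + y, a),
                          np.add.accumulate(a))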
Reduceat
^^^^^^^^
.. index::
triple: ufunc; methods; reduceat
single: ufunc
The reduceat function is a generalization of both the reduce and
accumulate functions. It implements a reduce over ranges of the input
array specified by indices. The extra indices argument is checked to
be sure that every index is not too large for the input array along
the selected dimension before the loop calculations take place. The
loop implementation is handled using code that is very similar to the
reduce code repeated as many times as there are elements in the
indices input. In particular: the first input pointer passed to the
underlying 1-D computational loop points to the input array at the
correct location indicated by the index array. In addition, the output
pointer and the second input pointer passed to the underlying 1-D loop
point to the same position in memory. The size of the 1-D
computational loop is fixed to be the difference between the current
index and the next index (when the current index is the last index,
then the next index is assumed to be the length of the array along the
selected dimension). In this way, the 1-D loop will implement a reduce
over the specified indices.
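A pure-Python sketch of these semantics (``model_reduceat`` is an
invented name, not the actual implementation) is::

    import numpy as np

    def model_reduceat(op, a, indices):
        # Each output element is a reduce over a[indices[m]:indices[m+1]]
        # (the last range extends to the end of the array).
        out = np.empty(len(indices), dtype=a.dtype)
        for m, start in enumerate(indices):
            stop = indices[m + 1] if m + 1 < len(indices) else len(a)
            o = a[start]
            for k in range(start + 1, stop):
                o = op(a[k], o)
            out[m] = o
        return out

    a = np.arange(8)
    assert np.array_equal(model_reduceat(lambda x, y: x + y, a, [0, 4, 6]),
                          np.add.reduceat(a, [0, 4, 6]))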
Mis-aligned data, or a loop data-type that does not match the input
and/or output data-type, is handled using buffered code, wherein data
is copied to a temporary buffer and cast to the correct data-type if
necessary prior to calling the underlying 1-D function. The temporary
buffers are created in (element) sizes no bigger than the user-settable
buffer-size value. Thus, the loop must be flexible enough to call the
underlying 1-D computational loop enough times to complete the total
calculation in chunks no bigger than the buffer-size.

View file

@ -1,9 +0,0 @@
***************
Numpy internals
***************
.. toctree::
internals.code-explanations
.. automodule:: numpy.doc.internals

View file

@ -1,462 +0,0 @@
.. currentmodule:: numpy.ma
.. _numpy.ma.constants:
Constants of the :mod:`numpy.ma` module
=======================================
In addition to the :class:`MaskedArray` class, the :mod:`numpy.ma` module
defines several constants.
.. data:: masked
The :attr:`masked` constant is a special case of :class:`MaskedArray`,
with a float datatype and a null shape. It is used to test whether a
specific entry of a masked array is masked, or to mask one or several
entries of a masked array::
>>> x = ma.array([1, 2, 3], mask=[0, 1, 0])
>>> x[1] is ma.masked
True
>>> x[-1] = ma.masked
>>> x
masked_array(data = [1 -- --],
mask = [False True True],
fill_value = 999999)
.. data:: nomask
Value indicating that a masked array has no invalid entry.
:attr:`nomask` is used internally to speed up computations when the mask
is not needed.
.. data:: masked_print_option
String used in lieu of missing data when a masked array is printed.
By default, this string is ``'--'``.
.. _maskedarray.baseclass:
The :class:`MaskedArray` class
==============================
.. class:: MaskedArray
A subclass of :class:`~numpy.ndarray` designed to manipulate numerical arrays with missing data.
An instance of :class:`MaskedArray` can be thought of as the combination of several elements:
* The :attr:`~MaskedArray.data`, as a regular :class:`numpy.ndarray` of any shape or datatype (the data).
* A boolean :attr:`~numpy.ma.MaskedArray.mask` with the same shape as the data, where a ``True`` value indicates that the corresponding element of the data is invalid.
The special value :const:`nomask` is also acceptable for arrays without named fields, and indicates that no data is invalid.
* A :attr:`~numpy.ma.MaskedArray.fill_value`, a value that may be used to replace the invalid entries in order to return a standard :class:`numpy.ndarray`.
Attributes and properties of masked arrays
------------------------------------------
.. seealso:: :ref:`Array Attributes <arrays.ndarray.attributes>`
.. attribute:: MaskedArray.data
Returns the underlying data, as a view of the masked array.
If the underlying data is a subclass of :class:`numpy.ndarray`, it is
returned as such.
>>> x = ma.array(np.matrix([[1, 2], [3, 4]]), mask=[[0, 1], [1, 0]])
>>> x.data
matrix([[1, 2],
[3, 4]])
The type of the data can be accessed through the :attr:`baseclass`
attribute.
.. attribute:: MaskedArray.mask
Returns the underlying mask, as an array with the same shape and structure
as the data, but where all fields are atomically booleans.
A value of ``True`` indicates an invalid entry.
.. attribute:: MaskedArray.recordmask
Returns the mask of the array if it has no named fields. For structured
arrays, returns a ndarray of booleans where entries are ``True`` if **all**
the fields are masked, ``False`` otherwise::
>>> x = ma.array([(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)],
... mask=[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
... dtype=[('a', int), ('b', int)])
>>> x.recordmask
array([False, False, True, False, False], dtype=bool)
.. attribute:: MaskedArray.fill_value
Returns the value used to fill the invalid entries of a masked array.
The value is either a scalar (if the masked array has no named fields),
or a 0-D ndarray with the same :attr:`dtype` as the masked array if it has
named fields.
The default filling value depends on the datatype of the array:
======== ========
datatype default
======== ========
bool True
int 999999
float 1.e20
complex 1.e20+0j
object '?'
string 'N/A'
======== ========
.. attribute:: MaskedArray.baseclass
Returns the class of the underlying data.
>>> x = ma.array(np.matrix([[1, 2], [3, 4]]), mask=[[0, 0], [1, 0]])
>>> x.baseclass
<class 'numpy.matrixlib.defmatrix.matrix'>
.. attribute:: MaskedArray.sharedmask
Returns whether the mask of the array is shared between several masked arrays.
If this is the case, any modification to the mask of one array will be
propagated to the others.
.. attribute:: MaskedArray.hardmask
Returns whether the mask is hard (``True``) or soft (``False``).
When the mask is hard, masked entries cannot be unmasked.
As :class:`MaskedArray` is a subclass of :class:`~numpy.ndarray`, a masked array also inherits all the attributes and properties of a :class:`~numpy.ndarray` instance.
.. autosummary::
:toctree: generated/
MaskedArray.base
MaskedArray.ctypes
MaskedArray.dtype
MaskedArray.flags
MaskedArray.itemsize
MaskedArray.nbytes
MaskedArray.ndim
MaskedArray.shape
MaskedArray.size
MaskedArray.strides
MaskedArray.imag
MaskedArray.real
MaskedArray.flat
MaskedArray.__array_priority__
:class:`MaskedArray` methods
============================
.. seealso:: :ref:`Array methods <array.ndarray.methods>`
Conversion
----------
.. autosummary::
:toctree: generated/
MaskedArray.__float__
MaskedArray.__hex__
MaskedArray.__int__
MaskedArray.__long__
MaskedArray.__oct__
MaskedArray.view
MaskedArray.astype
MaskedArray.byteswap
MaskedArray.compressed
MaskedArray.filled
MaskedArray.tofile
MaskedArray.toflex
MaskedArray.tolist
MaskedArray.torecords
MaskedArray.tostring
Shape manipulation
------------------
For reshape, resize, and transpose, the single tuple argument may be
replaced with ``n`` integers which will be interpreted as an n-tuple.
.. autosummary::
:toctree: generated/
MaskedArray.flatten
MaskedArray.ravel
MaskedArray.reshape
MaskedArray.resize
MaskedArray.squeeze
MaskedArray.swapaxes
MaskedArray.transpose
MaskedArray.T
Item selection and manipulation
-------------------------------
For array methods that take an *axis* keyword, it defaults to `None`.
If axis is *None*, then the array is treated as a 1-D array.
Any other value for *axis* represents the dimension along which
the operation should proceed.
.. autosummary::
:toctree: generated/
MaskedArray.argmax
MaskedArray.argmin
MaskedArray.argsort
MaskedArray.choose
MaskedArray.compress
MaskedArray.diagonal
MaskedArray.fill
MaskedArray.item
MaskedArray.nonzero
MaskedArray.put
MaskedArray.repeat
MaskedArray.searchsorted
MaskedArray.sort
MaskedArray.take
Pickling and copy
-----------------
.. autosummary::
:toctree: generated/
MaskedArray.copy
MaskedArray.dump
MaskedArray.dumps
Calculations
------------
.. autosummary::
:toctree: generated/
MaskedArray.all
MaskedArray.anom
MaskedArray.any
MaskedArray.clip
MaskedArray.conj
MaskedArray.conjugate
MaskedArray.cumprod
MaskedArray.cumsum
MaskedArray.max
MaskedArray.mean
MaskedArray.min
MaskedArray.prod
MaskedArray.product
MaskedArray.ptp
MaskedArray.round
MaskedArray.std
MaskedArray.sum
MaskedArray.trace
MaskedArray.var
Arithmetic and comparison operations
------------------------------------
.. index:: comparison, arithmetic, operation, operator
Comparison operators:
~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.__lt__
MaskedArray.__le__
MaskedArray.__gt__
MaskedArray.__ge__
MaskedArray.__eq__
MaskedArray.__ne__
Truth value of an array (:func:`bool()`):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.__nonzero__
Arithmetic:
~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.__abs__
MaskedArray.__add__
MaskedArray.__radd__
MaskedArray.__sub__
MaskedArray.__rsub__
MaskedArray.__mul__
MaskedArray.__rmul__
MaskedArray.__div__
MaskedArray.__rdiv__
MaskedArray.__truediv__
MaskedArray.__rtruediv__
MaskedArray.__floordiv__
MaskedArray.__rfloordiv__
MaskedArray.__mod__
MaskedArray.__rmod__
MaskedArray.__divmod__
MaskedArray.__rdivmod__
MaskedArray.__pow__
MaskedArray.__rpow__
MaskedArray.__lshift__
MaskedArray.__rlshift__
MaskedArray.__rshift__
MaskedArray.__rrshift__
MaskedArray.__and__
MaskedArray.__rand__
MaskedArray.__or__
MaskedArray.__ror__
MaskedArray.__xor__
MaskedArray.__rxor__
Arithmetic, in-place:
~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.__iadd__
MaskedArray.__isub__
MaskedArray.__imul__
MaskedArray.__idiv__
MaskedArray.__itruediv__
MaskedArray.__ifloordiv__
MaskedArray.__imod__
MaskedArray.__ipow__
MaskedArray.__ilshift__
MaskedArray.__irshift__
MaskedArray.__iand__
MaskedArray.__ior__
MaskedArray.__ixor__
Representation
--------------
.. autosummary::
:toctree: generated/
MaskedArray.__repr__
MaskedArray.__str__
MaskedArray.ids
MaskedArray.iscontiguous
Special methods
---------------
For standard library functions:
.. autosummary::
:toctree: generated/
MaskedArray.__copy__
MaskedArray.__deepcopy__
MaskedArray.__getstate__
MaskedArray.__reduce__
MaskedArray.__setstate__
Basic customization:
.. autosummary::
:toctree: generated/
MaskedArray.__new__
MaskedArray.__array__
MaskedArray.__array_wrap__
Container customization: (see :ref:`Indexing <arrays.indexing>`)
.. autosummary::
:toctree: generated/
MaskedArray.__len__
MaskedArray.__getitem__
MaskedArray.__setitem__
MaskedArray.__delitem__
MaskedArray.__getslice__
MaskedArray.__setslice__
MaskedArray.__contains__
Specific methods
----------------
Handling the mask
~~~~~~~~~~~~~~~~~
The following methods can be used to access information about the mask or to
manipulate the mask.
.. autosummary::
:toctree: generated/
MaskedArray.__setmask__
MaskedArray.harden_mask
MaskedArray.soften_mask
MaskedArray.unshare_mask
MaskedArray.shrink_mask
Handling the `fill_value`
~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.get_fill_value
MaskedArray.set_fill_value
Counting the missing elements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
MaskedArray.count

View file

@ -1,499 +0,0 @@
.. currentmodule:: numpy.ma
.. _maskedarray.generic:
The :mod:`numpy.ma` module
==========================
Rationale
---------
Masked arrays are arrays that may have missing or invalid entries.
The :mod:`numpy.ma` module provides a nearly work-alike replacement for numpy
that supports data arrays with masks.
What is a masked array?
-----------------------
In many circumstances, datasets can be incomplete or tainted by the presence
of invalid data. For example, a sensor may have failed to record a value, or
may have recorded an invalid one. The :mod:`numpy.ma` module provides a convenient
way to address this issue, by introducing masked arrays.
A masked array is the combination of a standard :class:`numpy.ndarray` and a
mask. A mask is either :attr:`nomask`, indicating that no value of the
associated array is invalid, or an array of booleans that determines for each
element of the associated array whether the value is valid or not. When an
element of the mask is ``False``, the corresponding element of the associated
array is valid and is said to be unmasked. When an element of the mask is
``True``, the corresponding element of the associated array is said to be
masked (invalid).
The package ensures that masked entries are not used in computations.
As an illustration, let's consider the following dataset::
>>> import numpy as np
>>> import numpy.ma as ma
>>> x = np.array([1, 2, 3, -1, 5])
We wish to mark the fourth entry as invalid. The easiest way is to create a masked
array::
>>> mx = ma.masked_array(x, mask=[0, 0, 0, 1, 0])
We can now compute the mean of the dataset, without taking the invalid data
into account::
>>> mx.mean()
2.75
The :mod:`numpy.ma` module
--------------------------
The main feature of the :mod:`numpy.ma` module is the :class:`MaskedArray`
class, which is a subclass of :class:`numpy.ndarray`. The class, its
attributes and methods are described in more details in the
:ref:`MaskedArray class <maskedarray.baseclass>` section.
The :mod:`numpy.ma` module can be used as an addition to :mod:`numpy`: ::
>>> import numpy as np
>>> import numpy.ma as ma
To create an array with the second element invalid, we would do::
>>> y = ma.array([1, 2, 3], mask = [0, 1, 0])
To create a masked array where all values close to 1.e20 are invalid, we would
do::
>>> z = ma.masked_values([1.0, 1.e20, 3.0, 4.0], 1.e20)
For a complete discussion of creation methods for masked arrays please see
section :ref:`Constructing masked arrays <maskedarray.generic.constructing>`.
Using numpy.ma
==============
.. _maskedarray.generic.constructing:
Constructing masked arrays
--------------------------
There are several ways to construct a masked array.
* A first possibility is to directly invoke the :class:`MaskedArray` class.
* A second possibility is to use the two masked array constructors,
:func:`array` and :func:`masked_array`.
.. autosummary::
:toctree: generated/
array
masked_array
* A third option is to take the view of an existing array. In that case, the
mask of the view is set to :attr:`nomask` if the array has no named fields,
or to an array of booleans with the same structure as the array otherwise.
>>> x = np.array([1, 2, 3])
>>> x.view(ma.MaskedArray)
masked_array(data = [1 2 3],
mask = False,
fill_value = 999999)
>>> x = np.array([(1, 1.), (2, 2.)], dtype=[('a',int), ('b', float)])
>>> x.view(ma.MaskedArray)
masked_array(data = [(1, 1.0) (2, 2.0)],
mask = [(False, False) (False, False)],
fill_value = (999999, 1e+20),
dtype = [('a', '<i4'), ('b', '<f8')])
* Yet another possibility is to use any of the following functions:
.. autosummary::
:toctree: generated/
asarray
asanyarray
fix_invalid
masked_equal
masked_greater
masked_greater_equal
masked_inside
masked_invalid
masked_less
masked_less_equal
masked_not_equal
masked_object
masked_outside
masked_values
masked_where
Accessing the data
------------------
The underlying data of a masked array can be accessed in several ways:
* through the :attr:`~MaskedArray.data` attribute. The output is a view of the
array as a :class:`numpy.ndarray` or one of its subclasses, depending on the
type of the underlying data at the masked array creation.
* through the :meth:`~MaskedArray.__array__` method. The output is then a
:class:`numpy.ndarray`.
* by directly taking a view of the masked array as a :class:`numpy.ndarray`
or one of its subclasses (which is actually what using the
:attr:`~MaskedArray.data` attribute does).
* by using the :func:`getdata` function.
None of these methods is completely satisfactory if some entries have been
marked as invalid. As a general rule, where a representation of the array is
required without any masked entries, it is recommended to fill the array with
the :meth:`filled` method.
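For example, filling with an explicit value returns a regular ndarray::

    >>> x = ma.array([1, 2, 3], mask=[0, 0, 1])
    >>> x.filled(-1)
    array([ 1,  2, -1])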
Accessing the mask
------------------
The mask of a masked array is accessible through its :attr:`~MaskedArray.mask`
attribute. We must keep in mind that a ``True`` entry in the mask indicates an
*invalid* entry.
Another possibility is to use the :func:`getmask` and :func:`getmaskarray`
functions. :func:`getmask(x)` outputs the mask of ``x`` if ``x`` is a masked
array, and the special value :data:`nomask` otherwise. :func:`getmaskarray(x)`
outputs the mask of ``x`` if ``x`` is a masked array. If ``x`` has no invalid
entry or is not a masked array, the function outputs a boolean array of
``False`` with as many elements as ``x``.
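For example::

    >>> x = ma.array([1, 2, 3], mask=[0, 1, 0])
    >>> ma.getmask(x)
    array([False,  True, False], dtype=bool)
    >>> ma.getmask(np.array([1, 2, 3]))
    False
    >>> ma.getmaskarray(np.array([1, 2, 3]))
    array([False, False, False], dtype=bool)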
Accessing only the valid entries
---------------------------------
To retrieve only the valid entries, we can use the inverse of the mask as an
index. The inverse of the mask can be calculated with the
:func:`numpy.logical_not` function or simply with the ``~`` operator::
>>> x = ma.array([[1, 2], [3, 4]], mask=[[0, 1], [1, 0]])
>>> x[~x.mask]
masked_array(data = [1 4],
mask = [False False],
fill_value = 999999)
Another way to retrieve the valid data is to use the :meth:`compressed`
method, which returns a one-dimensional :class:`~numpy.ndarray` (or one of its
subclasses, depending on the value of the :attr:`~MaskedArray.baseclass`
attribute)::
>>> x.compressed()
array([1, 4])
Note that the output of :meth:`compressed` is always 1D.
Modifying the mask
------------------
Masking an entry
~~~~~~~~~~~~~~~~
The recommended way to mark one or several specific entries of a masked array
as invalid is to assign the special value :attr:`masked` to them::
>>> x = ma.array([1, 2, 3])
>>> x[0] = ma.masked
>>> x
masked_array(data = [-- 2 3],
mask = [ True False False],
fill_value = 999999)
>>> y = ma.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> y[(0, 1, 2), (1, 2, 0)] = ma.masked
>>> y
masked_array(data =
[[1 -- 3]
[4 5 --]
[-- 8 9]],
mask =
[[False True False]
[False False True]
[ True False False]],
fill_value = 999999)
>>> z = ma.array([1, 2, 3, 4])
>>> z[:-2] = ma.masked
>>> z
masked_array(data = [-- -- 3 4],
mask = [ True True False False],
fill_value = 999999)
A second possibility is to modify the :attr:`~MaskedArray.mask` directly,
but this usage is discouraged.
.. note::
When creating a new masked array with a simple, non-structured datatype,
the mask is initially set to the special value :attr:`nomask`, that
corresponds roughly to the boolean ``False``. Trying to set an element of
:attr:`nomask` will fail with a :exc:`TypeError` exception, as a boolean
does not support item assignment.
All the entries of an array can be masked at once by assigning ``True`` to the
mask::
>>> x = ma.array([1, 2, 3], mask=[0, 0, 1])
>>> x.mask = True
>>> x
masked_array(data = [-- -- --],
mask = [ True True True],
fill_value = 999999)
Finally, specific entries can be masked and/or unmasked by assigning to the
mask a sequence of booleans::
>>> x = ma.array([1, 2, 3])
>>> x.mask = [0, 1, 0]
>>> x
masked_array(data = [1 -- 3],
mask = [False True False],
fill_value = 999999)
Unmasking an entry
~~~~~~~~~~~~~~~~~~
To unmask one or several specific entries, we can just assign one or several
new valid values to them::
>>> x = ma.array([1, 2, 3], mask=[0, 0, 1])
>>> x
masked_array(data = [1 2 --],
mask = [False False True],
fill_value = 999999)
>>> x[-1] = 5
>>> x
masked_array(data = [1 2 5],
mask = [False False False],
fill_value = 999999)
.. note::
Unmasking an entry by direct assignment will silently fail if the masked
array has a *hard* mask, as shown by the :attr:`hardmask` attribute. This
feature was introduced to prevent overwriting the mask. To force the
unmasking of an entry where the array has a hard mask, the mask must first
be softened using the :meth:`soften_mask` method before the assignment.
It can be re-hardened with :meth:`harden_mask`::
>>> x = ma.array([1, 2, 3], mask=[0, 0, 1], hard_mask=True)
>>> x
masked_array(data = [1 2 --],
mask = [False False True],
fill_value = 999999)
>>> x[-1] = 5
>>> x
masked_array(data = [1 2 --],
mask = [False False True],
fill_value = 999999)
>>> x.soften_mask()
>>> x[-1] = 5
>>> x
masked_array(data = [1 2 5],
mask = [False False False],
fill_value = 999999)
>>> x.harden_mask()
To unmask all masked entries of a masked array (provided the mask isn't a hard
mask), the simplest solution is to assign the constant :attr:`nomask` to the
mask::
>>> x = ma.array([1, 2, 3], mask=[0, 0, 1])
>>> x
masked_array(data = [1 2 --],
mask = [False False True],
fill_value = 999999)
>>> x.mask = ma.nomask
>>> x
masked_array(data = [1 2 3],
mask = [False False False],
fill_value = 999999)
Indexing and slicing
--------------------
As a :class:`MaskedArray` is a subclass of :class:`numpy.ndarray`, it inherits
its mechanisms for indexing and slicing.
When accessing a single entry of a masked array with no named fields, the
output is either a scalar (if the corresponding entry of the mask is
``False``) or the special value :attr:`masked` (if the corresponding entry of
the mask is ``True``)::
>>> x = ma.array([1, 2, 3], mask=[0, 0, 1])
>>> x[0]
1
>>> x[-1]
masked_array(data = --,
mask = True,
fill_value = 1e+20)
>>> x[-1] is ma.masked
True
If the masked array has named fields, accessing a single entry returns a
:class:`numpy.void` object if none of the fields are masked, or a 0d masked
array with the same dtype as the initial array if at least one of the fields
is masked.
>>> y = ma.masked_array([(1,2), (3, 4)],
... mask=[(0, 0), (0, 1)],
... dtype=[('a', int), ('b', int)])
>>> y[0]
(1, 2)
>>> y[-1]
masked_array(data = (3, --),
mask = (False, True),
fill_value = (999999, 999999),
dtype = [('a', '<i4'), ('b', '<i4')])
When accessing a slice, the output is a masked array whose
:attr:`~MaskedArray.data` attribute is a view of the original data, and whose
mask is either :attr:`nomask` (if there were no invalid entries in the original
array) or a copy of the corresponding slice of the original mask. The copy is
required to avoid propagation of any modification of the mask to the original.
>>> x = ma.array([1, 2, 3, 4, 5], mask=[0, 1, 0, 0, 1])
>>> mx = x[:3]
>>> mx
masked_array(data = [1 -- 3],
mask = [False True False],
fill_value = 999999)
>>> mx[1] = -1
>>> mx
masked_array(data = [1 -1 3],
mask = [False False False],
fill_value = 999999)
>>> x.mask
array([False, True, False, False, True], dtype=bool)
>>> x.data
array([ 1, -1, 3, 4, 5])
Accessing a field of a masked array with structured datatype returns a
:class:`MaskedArray`.
Operations on masked arrays
---------------------------
Arithmetic and comparison operations are supported by masked arrays.
As much as possible, invalid entries of a masked array are not processed,
meaning that the corresponding :attr:`data` entries *should* be the same
before and after the operation.
.. warning::
We need to stress that this behavior may not be systematic: masked
data may be affected by the operation in some cases, and users should
therefore not rely on this data remaining unchanged.
The :mod:`numpy.ma` module comes with a specific implementation of most
ufuncs. Unary and binary functions that have a validity domain (such as
:func:`~numpy.log` or :func:`~numpy.divide`) return the :data:`masked`
constant whenever the input is masked or falls outside the validity domain::
>>> ma.log([-1, 0, 1, 2])
masked_array(data = [-- -- 0.0 0.69314718056],
mask = [ True True False False],
fill_value = 1e+20)
Masked arrays also support standard numpy ufuncs. The output is then a masked
array. The result of a unary ufunc is masked wherever the input is masked. The
result of a binary ufunc is masked wherever any of the inputs is masked. If the
ufunc also returns the optional context output (a 3-element tuple containing
the name of the ufunc, its arguments and its domain), the context is processed
and entries of the output masked array are masked wherever the corresponding
inputs fall outside the validity domain::
>>> x = ma.array([-1, 1, 0, 2, 3], mask=[0, 0, 0, 0, 1])
>>> np.log(x)
masked_array(data = [-- -- 0.0 0.69314718056 --],
mask = [ True True False False True],
fill_value = 1e+20)
Examples
========
Data with a given value representing missing data
-------------------------------------------------
Let's consider a list of elements, ``x``, where values of -9999. represent
missing data. We wish to compute the average value of the data and the vector
of anomalies (deviations from the average)::
>>> import numpy.ma as ma
>>> x = [0.,1.,-9999.,3.,4.]
>>> mx = ma.masked_values(x, -9999.)
>>> print mx.mean()
2.0
>>> print mx - mx.mean()
[-2.0 -1.0 -- 1.0 2.0]
>>> print mx.anom()
[-2.0 -1.0 -- 1.0 2.0]
Filling in the missing data
---------------------------
Suppose now that we wish to print that same data, but with the missing values
replaced by the average value.
>>> print mx.filled(mx.mean())
[ 0. 1. 2. 3. 4.]
Numerical operations
--------------------
Numerical operations can be easily performed without worrying about missing
values, dividing by zero, square roots of negative numbers, etc.::
>>> import numpy as np, numpy.ma as ma
>>> x = ma.array([1., -1., 3., 4., 5., 6.], mask=[0,0,0,0,1,0])
>>> y = ma.array([1., 2., 0., 4., 5., 6.], mask=[0,0,0,0,0,1])
>>> print np.sqrt(x/y)
[1.0 -- -- 1.0 -- --]
Four values of the output are invalid: the first one comes from taking the
square root of a negative number, the second from the division by zero, and
the last two from positions where the inputs were masked.
Ignoring extreme values
-----------------------
Let's consider an array ``d`` of random floats between 0 and 1. We wish to
compute the average of the values of ``d`` while ignoring any data outside
the range ``[0.1, 0.9]``::
>>> import numpy as np
>>> d = np.random.rand(1000)
>>> print ma.masked_outside(d, 0.1, 0.9).mean()

View file

@ -1,19 +0,0 @@
.. _maskedarray:
*************
Masked arrays
*************
Masked arrays are arrays that may have missing or invalid entries.
The :mod:`numpy.ma` module provides a nearly work-alike replacement for numpy
that supports data arrays with masks.
.. index::
single: masked arrays
.. toctree::
:maxdepth: 2
maskedarray.generic
maskedarray.baseclass
routines.ma

View file

@ -1,103 +0,0 @@
.. _routines.array-creation:
Array creation routines
=======================
.. seealso:: :ref:`Array creation <arrays.creation>`
.. currentmodule:: numpy
Ones and zeros
--------------
.. autosummary::
:toctree: generated/
empty
empty_like
eye
identity
ones
ones_like
zeros
zeros_like
From existing data
------------------
.. autosummary::
:toctree: generated/
array
asarray
asanyarray
ascontiguousarray
asmatrix
copy
frombuffer
fromfile
fromfunction
fromiter
fromstring
loadtxt
.. _routines.array-creation.rec:
Creating record arrays (:mod:`numpy.rec`)
-----------------------------------------
.. note:: :mod:`numpy.rec` is the preferred alias for
:mod:`numpy.core.records`.
.. autosummary::
:toctree: generated/
core.records.array
core.records.fromarrays
core.records.fromrecords
core.records.fromstring
core.records.fromfile
.. _routines.array-creation.char:
Creating character arrays (:mod:`numpy.char`)
---------------------------------------------
.. note:: :mod:`numpy.char` is the preferred alias for
:mod:`numpy.core.defchararray`.
.. autosummary::
:toctree: generated/
core.defchararray.array
core.defchararray.asarray
Numerical ranges
----------------
.. autosummary::
:toctree: generated/
arange
linspace
logspace
meshgrid
mgrid
ogrid
Building matrices
-----------------
.. autosummary::
:toctree: generated/
diag
diagflat
tri
tril
triu
vander
The Matrix class
----------------
.. autosummary::
:toctree: generated/
mat
bmat

View file

@ -1,104 +0,0 @@
Array manipulation routines
***************************
.. currentmodule:: numpy
Changing array shape
====================
.. autosummary::
:toctree: generated/
reshape
ravel
ndarray.flat
ndarray.flatten
Transpose-like operations
=========================
.. autosummary::
:toctree: generated/
rollaxis
swapaxes
ndarray.T
transpose
Changing number of dimensions
=============================
.. autosummary::
:toctree: generated/
atleast_1d
atleast_2d
atleast_3d
broadcast
broadcast_arrays
expand_dims
squeeze
Changing kind of array
======================
.. autosummary::
:toctree: generated/
asarray
asanyarray
asmatrix
asfarray
asfortranarray
asscalar
require
Joining arrays
==============
.. autosummary::
:toctree: generated/
column_stack
concatenate
dstack
hstack
vstack
Splitting arrays
================
.. autosummary::
:toctree: generated/
array_split
dsplit
hsplit
split
vsplit
Tiling arrays
=============
.. autosummary::
:toctree: generated/
tile
repeat
Adding and removing elements
============================
.. autosummary::
:toctree: generated/
delete
insert
append
resize
trim_zeros
unique
Rearranging elements
====================
.. autosummary::
:toctree: generated/
fliplr
flipud
reshape
roll
rot90

View file

@ -1,31 +0,0 @@
Binary operations
*****************
.. currentmodule:: numpy
Elementwise bit operations
--------------------------
.. autosummary::
:toctree: generated/
bitwise_and
bitwise_or
bitwise_xor
invert
left_shift
right_shift
Bit packing
-----------
.. autosummary::
:toctree: generated/
packbits
unpackbits
Output formatting
-----------------
.. autosummary::
:toctree: generated/
binary_repr

View file

@ -1,88 +0,0 @@
String operations
*****************
.. currentmodule:: numpy.core.defchararray
This module provides a set of vectorized string operations for arrays
of type `numpy.string_` or `numpy.unicode_`. All of them are based on
the string methods in the Python standard library.
String operations
-----------------
.. autosummary::
:toctree: generated/
add
multiply
mod
capitalize
center
decode
encode
join
ljust
lower
lstrip
partition
replace
rjust
rpartition
rsplit
rstrip
split
splitlines
strip
swapcase
title
translate
upper
zfill
Comparison
----------
Unlike the standard numpy comparison operators, the ones in the `char`
module strip trailing whitespace characters before performing the
comparison.
.. autosummary::
:toctree: generated/
equal
not_equal
greater_equal
less_equal
greater
less
String information
------------------
.. autosummary::
:toctree: generated/
count
len
find
index
isalpha
isdecimal
isdigit
islower
isnumeric
isspace
istitle
isupper
rfind
rindex
startswith
Convenience class
-----------------
.. autosummary::
:toctree: generated/
chararray

View file

@ -1,11 +0,0 @@
***********************************************************
C-Types Foreign Function Interface (:mod:`numpy.ctypeslib`)
***********************************************************
.. currentmodule:: numpy.ctypeslib
.. autofunction:: as_array
.. autofunction:: as_ctypes
.. autofunction:: ctypes_load_library
.. autofunction:: load_library
.. autofunction:: ndpointer

View file

@ -1,52 +0,0 @@
.. _routines.dtype:
Data type routines
==================
.. currentmodule:: numpy
.. autosummary::
:toctree: generated/
can_cast
common_type
obj2sctype
Creating data types
-------------------
.. autosummary::
:toctree: generated/
dtype
format_parser
Data type information
---------------------
.. autosummary::
:toctree: generated/
finfo
iinfo
MachAr
Data type testing
-----------------
.. autosummary::
:toctree: generated/
issctype
issubdtype
issubsctype
issubclass_
find_common_type
Miscellaneous
-------------
.. autosummary::
:toctree: generated/
typename
sctype2char
mintypecode

View file

@ -1,48 +0,0 @@
Optionally Scipy-accelerated routines (:mod:`numpy.dual`)
*********************************************************
.. automodule:: numpy.dual
Linear algebra
--------------
.. currentmodule:: numpy.linalg
.. autosummary::
cholesky
det
eig
eigh
eigvals
eigvalsh
inv
lstsq
norm
pinv
solve
svd
FFT
---
.. currentmodule:: numpy.fft
.. autosummary::
fft
fft2
fftn
ifft
ifft2
ifftn
Other
-----
.. currentmodule:: numpy
.. autosummary::
i0

View file

@ -1,10 +0,0 @@
Mathematical functions with automatic domain (:mod:`numpy.emath`)
***********************************************************************
.. currentmodule:: numpy
.. note:: :mod:`numpy.emath` is a preferred alias for :mod:`numpy.lib.scimath`,
available after :mod:`numpy` is imported.
.. automodule:: numpy.lib.scimath

View file

@ -1,25 +0,0 @@
Floating point error handling
*****************************
.. currentmodule:: numpy
Setting and getting error handling
----------------------------------
.. autosummary::
:toctree: generated/
seterr
geterr
seterrcall
geterrcall
errstate
Internal functions
------------------
.. autosummary::
:toctree: generated/
seterrobj
geterrobj

View file

@ -1,2 +0,0 @@
.. _routines.fft:
.. automodule:: numpy.fft

View file

@ -1,21 +0,0 @@
Financial functions
*******************
.. currentmodule:: numpy
Simple financial functions
--------------------------
.. autosummary::
:toctree: generated/
fv
pv
npv
pmt
ppmt
ipmt
irr
mirr
nper
rate

View file

@ -1,13 +0,0 @@
Functional programming
**********************
.. currentmodule:: numpy
.. autosummary::
:toctree: generated/
apply_along_axis
apply_over_axes
vectorize
frompyfunc
piecewise

View file

@ -1,24 +0,0 @@
.. _routines.help:
Numpy-specific help functions
=============================
.. currentmodule:: numpy
Finding help
------------
.. autosummary::
:toctree: generated/
lookfor
Reading help
------------
.. autosummary::
:toctree: generated/
info
source

View file

@ -1,61 +0,0 @@
.. _routines.indexing:
Indexing routines
=================
.. seealso:: :ref:`Indexing <arrays.indexing>`
.. currentmodule:: numpy
Generating index arrays
-----------------------
.. autosummary::
:toctree: generated/
c_
r_
s_
nonzero
where
indices
ix_
ogrid
unravel_index
diag_indices
diag_indices_from
mask_indices
tril_indices
tril_indices_from
triu_indices
triu_indices_from
Indexing-like operations
------------------------
.. autosummary::
:toctree: generated/
take
choose
compress
diag
diagonal
select
Inserting data into arrays
--------------------------
.. autosummary::
:toctree: generated/
place
put
putmask
fill_diagonal
Iterating over arrays
---------------------
.. autosummary::
:toctree: generated/
ndenumerate
ndindex
flatiter

View file

@ -1,65 +0,0 @@
Input and output
****************
.. currentmodule:: numpy
NPZ files
---------
.. autosummary::
:toctree: generated/
load
save
savez
Text files
----------
.. autosummary::
:toctree: generated/
loadtxt
savetxt
genfromtxt
fromregex
fromstring
ndarray.tofile
ndarray.tolist
String formatting
-----------------
.. autosummary::
:toctree: generated/
array_repr
array_str
Memory mapping files
--------------------
.. autosummary::
:toctree: generated/
memmap
Text formatting options
-----------------------
.. autosummary::
:toctree: generated/
set_printoptions
get_printoptions
set_string_function
Base-n representations
----------------------
.. autosummary::
:toctree: generated/
binary_repr
base_repr
Data sources
------------
.. autosummary::
:toctree: generated/
DataSource

View file

@ -1,68 +0,0 @@
.. _routines.linalg:
Linear algebra (:mod:`numpy.linalg`)
************************************
.. currentmodule:: numpy
Matrix and vector products
--------------------------
.. autosummary::
:toctree: generated/
dot
vdot
inner
outer
tensordot
linalg.matrix_power
kron
Decompositions
--------------
.. autosummary::
:toctree: generated/
linalg.cholesky
linalg.qr
linalg.svd
Matrix eigenvalues
------------------
.. autosummary::
:toctree: generated/
linalg.eig
linalg.eigh
linalg.eigvals
linalg.eigvalsh
Norms and other numbers
-----------------------
.. autosummary::
:toctree: generated/
linalg.norm
linalg.cond
linalg.det
linalg.slogdet
trace
Solving equations and inverting matrices
----------------------------------------
.. autosummary::
:toctree: generated/
linalg.solve
linalg.tensorsolve
linalg.lstsq
linalg.inv
linalg.pinv
linalg.tensorinv
Exceptions
----------
.. autosummary::
:toctree: generated/
linalg.LinAlgError

View file

@ -1,64 +0,0 @@
Logic functions
***************
.. currentmodule:: numpy
Truth value testing
-------------------
.. autosummary::
:toctree: generated/
all
any
Array contents
--------------
.. autosummary::
:toctree: generated/
isfinite
isinf
isnan
isneginf
isposinf
Array type testing
------------------
.. autosummary::
:toctree: generated/
iscomplex
iscomplexobj
isfortran
isreal
isrealobj
isscalar
Logical operations
------------------
.. autosummary::
:toctree: generated/
logical_and
logical_or
logical_not
logical_xor
Comparison
----------
.. autosummary::
:toctree: generated/
allclose
array_equal
array_equiv
.. autosummary::
:toctree: generated/
greater
greater_equal
less
less_equal
equal
not_equal

View file

@ -1,404 +0,0 @@
.. _routines.ma:
Masked array operations
***********************
.. currentmodule:: numpy
Constants
=========
.. autosummary::
:toctree: generated/
ma.MaskType
Creation
========
From existing data
~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.masked_array
ma.array
ma.copy
ma.frombuffer
ma.fromfunction
ma.MaskedArray.copy
Ones and zeros
~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.empty
ma.empty_like
ma.masked_all
ma.masked_all_like
ma.ones
ma.zeros
_____
Inspecting the array
====================
.. autosummary::
:toctree: generated/
ma.all
ma.any
ma.count
ma.count_masked
ma.getmask
ma.getmaskarray
ma.getdata
ma.nonzero
ma.shape
ma.size
ma.MaskedArray.data
ma.MaskedArray.mask
ma.MaskedArray.recordmask
ma.MaskedArray.all
ma.MaskedArray.any
ma.MaskedArray.count
ma.MaskedArray.nonzero
ma.shape
ma.size
_____
Manipulating a MaskedArray
==========================
Changing the shape
~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.ravel
ma.reshape
ma.resize
ma.MaskedArray.flatten
ma.MaskedArray.ravel
ma.MaskedArray.reshape
ma.MaskedArray.resize
Modifying axes
~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.swapaxes
ma.transpose
ma.MaskedArray.swapaxes
ma.MaskedArray.transpose
Changing the number of dimensions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.atleast_1d
ma.atleast_2d
ma.atleast_3d
ma.expand_dims
ma.squeeze
ma.MaskedArray.squeeze
ma.column_stack
ma.concatenate
ma.dstack
ma.hstack
ma.hsplit
ma.mr_
ma.row_stack
ma.vstack
Joining arrays
~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.column_stack
ma.concatenate
ma.dstack
ma.hstack
ma.vstack
_____
Operations on masks
===================
Creating a mask
~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.make_mask
ma.make_mask_none
ma.mask_or
ma.make_mask_descr
Accessing a mask
~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.getmask
ma.getmaskarray
ma.masked_array.mask
Finding masked data
~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.flatnotmasked_contiguous
ma.flatnotmasked_edges
ma.notmasked_contiguous
ma.notmasked_edges
Modifying a mask
~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.mask_cols
ma.mask_or
ma.mask_rowcols
ma.mask_rows
ma.harden_mask
ma.soften_mask
ma.MaskedArray.harden_mask
ma.MaskedArray.soften_mask
ma.MaskedArray.shrink_mask
ma.MaskedArray.unshare_mask
_____
Conversion operations
======================
> to a masked array
~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.asarray
ma.asanyarray
ma.fix_invalid
ma.masked_equal
ma.masked_greater
ma.masked_greater_equal
ma.masked_inside
ma.masked_invalid
ma.masked_less
ma.masked_less_equal
ma.masked_not_equal
ma.masked_object
ma.masked_outside
ma.masked_values
ma.masked_where
> to a ndarray
~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.compress_cols
ma.compress_rowcols
ma.compress_rows
ma.compressed
ma.filled
ma.MaskedArray.compressed
ma.MaskedArray.filled
> to another object
~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.MaskedArray.tofile
ma.MaskedArray.tolist
ma.MaskedArray.torecords
ma.MaskedArray.tostring
Pickling and unpickling
~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.dump
ma.dumps
ma.load
ma.loads
Filling a masked array
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.common_fill_value
ma.default_fill_value
ma.maximum_fill_value
ma.minimum_fill_value
ma.set_fill_value
ma.MaskedArray.get_fill_value
ma.MaskedArray.set_fill_value
ma.MaskedArray.fill_value
_____
Masked arrays arithmetics
=========================
Arithmetics
~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.anom
ma.anomalies
ma.average
ma.conjugate
ma.corrcoef
ma.cov
ma.cumsum
ma.cumprod
ma.mean
ma.median
ma.power
ma.prod
ma.std
ma.sum
ma.var
ma.MaskedArray.anom
ma.MaskedArray.cumprod
ma.MaskedArray.cumsum
ma.MaskedArray.mean
ma.MaskedArray.prod
ma.MaskedArray.std
ma.MaskedArray.sum
ma.MaskedArray.var
Minimum/maximum
~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.argmax
ma.argmin
ma.max
ma.min
ma.ptp
ma.MaskedArray.argmax
ma.MaskedArray.argmin
ma.MaskedArray.max
ma.MaskedArray.min
ma.MaskedArray.ptp
Sorting
~~~~~~~
.. autosummary::
:toctree: generated/
ma.argsort
ma.sort
ma.MaskedArray.argsort
ma.MaskedArray.sort
Algebra
~~~~~~~
.. autosummary::
:toctree: generated/
ma.diag
ma.dot
ma.identity
ma.inner
ma.innerproduct
ma.outer
ma.outerproduct
ma.trace
ma.transpose
ma.MaskedArray.trace
ma.MaskedArray.transpose
Polynomial fit
~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.vander
ma.polyfit
Clipping and rounding
~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.around
ma.clip
ma.round
ma.MaskedArray.clip
ma.MaskedArray.round
Miscellanea
~~~~~~~~~~~
.. autosummary::
:toctree: generated/
ma.allequal
ma.allclose
ma.apply_along_axis
ma.arange
ma.choose
ma.ediff1d
ma.indices
ma.where

View file

@ -1,150 +0,0 @@
Mathematical functions
**********************
.. currentmodule:: numpy
Trigonometric functions
-----------------------
.. autosummary::
:toctree: generated/
sin
cos
tan
arcsin
arccos
arctan
hypot
arctan2
degrees
radians
unwrap
deg2rad
rad2deg
Hyperbolic functions
--------------------
.. autosummary::
:toctree: generated/
sinh
cosh
tanh
arcsinh
arccosh
arctanh
Rounding
--------
.. autosummary::
:toctree: generated/
around
round_
rint
fix
floor
ceil
trunc
Sums, products, differences
---------------------------
.. autosummary::
:toctree: generated/
prod
sum
nansum
cumprod
cumsum
diff
ediff1d
gradient
cross
trapz
Exponents and logarithms
------------------------
.. autosummary::
:toctree: generated/
exp
expm1
exp2
log
log10
log2
log1p
logaddexp
logaddexp2
Other special functions
-----------------------
.. autosummary::
:toctree: generated/
i0
sinc
Floating point routines
-----------------------
.. autosummary::
:toctree: generated/
signbit
copysign
frexp
ldexp
Arithmetic operations
---------------------
.. autosummary::
:toctree: generated/
add
reciprocal
negative
multiply
divide
power
subtract
true_divide
floor_divide
fmod
mod
modf
remainder
Handling complex numbers
------------------------
.. autosummary::
:toctree: generated/
angle
real
imag
conj
Miscellaneous
-------------
.. autosummary::
:toctree: generated/
convolve
clip
sqrt
square
absolute
fabs
sign
maximum
minimum
nan_to_num
real_if_close
interp

View file

@ -1,11 +0,0 @@
Matrix library (:mod:`numpy.matlib`)
************************************
.. currentmodule:: numpy
This module contains all functions in the :mod:`numpy` namespace, with
the following replacement functions that return :class:`matrices
<matrix>` instead of :class:`ndarrays <ndarray>`.
.. automodule:: numpy.matlib

View file

@ -1,6 +0,0 @@
**********************************************
Numarray compatibility (:mod:`numpy.numarray`)
**********************************************
.. automodule:: numpy.numarray

View file

@ -1,8 +0,0 @@
***************************************************
Old Numeric compatibility (:mod:`numpy.oldnumeric`)
***************************************************
.. currentmodule:: numpy
.. automodule:: numpy.oldnumeric

View file

@ -1,24 +0,0 @@
Miscellaneous routines
**********************
.. toctree::
.. currentmodule:: numpy
Buffer objects
--------------
.. autosummary::
:toctree: generated/
getbuffer
newbuffer
Performance tuning
------------------
.. autosummary::
:toctree: generated/
alterdot
restoredot
setbufsize
getbufsize

View file

@ -1,46 +0,0 @@
Polynomials
***********
.. currentmodule:: numpy
Basics
------
.. autosummary::
:toctree: generated/
poly1d
polyval
poly
roots
Fitting
-------
.. autosummary::
:toctree: generated/
polyfit
Calculus
--------
.. autosummary::
:toctree: generated/
polyder
polyint
Arithmetic
----------
.. autosummary::
:toctree: generated/
polyadd
polydiv
polymul
polysub
Warnings
--------
.. autosummary::
:toctree: generated/
RankWarning

View file

@ -1,77 +0,0 @@
.. _routines.random:
Random sampling (:mod:`numpy.random`)
*************************************
.. currentmodule:: numpy.random
Simple random data
==================
.. autosummary::
:toctree: generated/
rand
randn
randint
random_integers
random_sample
bytes
Permutations
============
.. autosummary::
:toctree: generated/
shuffle
permutation
Distributions
=============
.. autosummary::
:toctree: generated/
beta
binomial
chisquare
mtrand.dirichlet
exponential
f
gamma
geometric
gumbel
hypergeometric
laplace
logistic
lognormal
logseries
multinomial
multivariate_normal
negative_binomial
noncentral_chisquare
noncentral_f
normal
pareto
poisson
power
rayleigh
standard_cauchy
standard_exponential
standard_gamma
standard_normal
standard_t
triangular
uniform
vonmises
wald
weibull
zipf
Random generator
================
.. autosummary::
:toctree: generated/
mtrand.RandomState
seed
get_state
set_state

View file

@ -1,47 +0,0 @@
********
Routines
********
In this chapter routine docstrings are presented, grouped by functionality.
Many docstrings contain example code, which demonstrates basic usage
of the routine. The examples assume that NumPy is imported with::
>>> import numpy as np
A convenient way to execute examples is the ``%doctest_mode`` mode of
IPython, which allows for pasting of multi-line examples and preserves
indentation.
.. toctree::
:maxdepth: 2
routines.array-creation
routines.array-manipulation
routines.indexing
routines.dtype
routines.io
routines.fft
routines.linalg
routines.random
routines.sort
routines.logic
routines.bitwise
routines.statistics
routines.math
routines.functional
routines.poly
routines.financial
routines.set
routines.window
routines.err
routines.ma
routines.help
routines.other
routines.testing
routines.emath
routines.matlib
routines.dual
routines.numarray
routines.oldnumeric
routines.ctypeslib
routines.char

View file

@ -1,22 +0,0 @@
Set routines
============
.. currentmodule:: numpy
Making proper sets
------------------
.. autosummary::
:toctree: generated/
unique
Boolean operations
------------------
.. autosummary::
:toctree: generated/
in1d
intersect1d
setdiff1d
setxor1d
union1d

View file

@ -1,32 +0,0 @@
Sorting and searching
=====================
.. currentmodule:: numpy
Sorting
-------
.. autosummary::
:toctree: generated/
sort
lexsort
argsort
ndarray.sort
msort
sort_complex
Searching
---------
.. autosummary::
:toctree: generated/
argmax
nanargmax
argmin
nanargmin
argwhere
nonzero
flatnonzero
where
searchsorted
extract

View file

@ -1,51 +0,0 @@
Statistics
==========
.. currentmodule:: numpy
Extremal values
---------------
.. autosummary::
:toctree: generated/
amin
amax
nanmax
nanmin
ptp
Averages and variances
----------------------
.. autosummary::
:toctree: generated/
average
mean
median
std
var
Correlating
-----------
.. autosummary::
:toctree: generated/
corrcoef
correlate
cov
Histograms
----------
.. autosummary::
:toctree: generated/
histogram
histogram2d
histogramdd
bincount
digitize

View file

@ -1,48 +0,0 @@
Test Support (:mod:`numpy.testing`)
===================================
.. currentmodule:: numpy.testing
Common test support for all numpy test scripts.
This single module should provide all the common functionality for numpy
tests in a single location, so that test scripts can just import it and
work right away.
Asserts
=======
.. autosummary::
:toctree: generated/
assert_almost_equal
assert_approx_equal
assert_array_almost_equal
assert_array_equal
assert_array_less
assert_equal
assert_raises
assert_warns
assert_string_equal
Decorators
----------
.. autosummary::
:toctree: generated/
decorators.deprecated
decorators.knownfailureif
decorators.setastest
decorators.skipif
decorators.slow
decorate_methods
Test Running
------------
.. autosummary::
:toctree: generated/
Tester
run_module_suite
rundocs

View file

@ -1,16 +0,0 @@
Window functions
================
.. currentmodule:: numpy
Various windows
---------------
.. autosummary::
:toctree: generated/
bartlett
blackman
hamming
hanning
kaiser

View file

@ -1,568 +0,0 @@
.. sectionauthor:: adapted from "Guide to Numpy" by Travis E. Oliphant
.. _ufuncs:
************************************
Universal functions (:class:`ufunc`)
************************************
.. note: XXX: section might need to be made more reference-guideish...
.. currentmodule:: numpy
.. index: ufunc, universal function, arithmetic, operation
A universal function (or :term:`ufunc` for short) is a function that
operates on :class:`ndarrays <ndarray>` in an element-by-element fashion,
supporting :ref:`array broadcasting <ufuncs.broadcasting>`, :ref:`type
casting <ufuncs.casting>`, and several other standard features. That
is, a ufunc is a ":term:`vectorized`" wrapper for a function that
takes a fixed number of scalar inputs and produces a fixed number of
scalar outputs.
In Numpy, universal functions are instances of the
:class:`numpy.ufunc` class. Many of the built-in functions are
implemented in compiled C code, but :class:`ufunc` instances can also
be produced using the :func:`frompyfunc` factory function.
.. _ufuncs.broadcasting:
Broadcasting
============
.. index:: broadcasting
Each universal function takes array inputs and produces array outputs
by performing the core function element-wise on the inputs. Standard
broadcasting rules are applied so that inputs not sharing exactly the
same shapes can still be usefully operated on. Broadcasting can be
understood by four rules:
1. All input arrays with :attr:`ndim <ndarray.ndim>` smaller than the
input array of largest :attr:`ndim <ndarray.ndim>` have 1's
prepended to their shapes.
2. The size in each dimension of the output shape is the maximum of all
the input sizes in that dimension.
3. An input can be used in the calculation if its size in a particular
dimension either matches the output size in that dimension or is
exactly 1.
4. If an input has a dimension size of 1 in its shape, the first data
entry in that dimension will be used for all calculations along
that dimension. In other words, the stepping machinery of the
:term:`ufunc` will simply not step along that dimension (the
:term:`stride` will be 0 for that dimension).
Broadcasting is used throughout NumPy to decide how to handle
disparately shaped arrays; for example, all arithmetic operations (``+``,
``-``, ``*``, ...) between :class:`ndarrays <ndarray>` broadcast the
arrays before operation.
.. _arrays.broadcasting.broadcastable:
.. index:: broadcastable
A set of arrays is called ":term:`broadcastable`" to the same shape if
the above rules produce a valid result, *i.e.*, one of the following
is true:
1. The arrays all have exactly the same shape.
2. The arrays all have the same number of dimensions and the length of
each dimension is either a common length or 1.
3. The arrays that have too few dimensions can have their shapes prepended
with a dimension of length 1 to satisfy property 2.
.. admonition:: Example
If ``a.shape`` is (5,1), ``b.shape`` is (1,6), ``c.shape`` is (6,)
and ``d.shape`` is () so that *d* is a scalar, then *a*, *b*, *c*,
and *d* are all broadcastable to dimension (5,6); and
- *a* acts like a (5,6) array where ``a[:,0]`` is broadcast to the other
columns,
- *b* acts like a (5,6) array where ``b[0,:]`` is broadcast
to the other rows,
- *c* acts like a (1,6) array and therefore like a (5,6) array
where ``c[:]`` is broadcast to every row, and finally,
- *d* acts like a (5,6) array where the single value is repeated.
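These shapes can be checked directly in an interactive session::

    >>> import numpy as np
    >>> a = np.ones((5, 1))
    >>> b = np.ones((1, 6))
    >>> c = np.ones(6)
    >>> d = np.float64(1.0)
    >>> (a + b + c + d).shape
    (5, 6)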
.. _ufuncs.output-type:
Output type determination
=========================
The output of the ufunc (and its methods) is not necessarily an
:class:`ndarray` if not all of the input arguments are
:class:`ndarrays <ndarray>`. All output arrays will be passed to the
:obj:`__array_prepare__` and :obj:`__array_wrap__` methods of the input
(other than :class:`ndarrays <ndarray>` and scalars) that defines these
methods **and** has the highest :obj:`__array_priority__` of any input
to the universal function. The default :obj:`__array_priority__` of the
ndarray is 0.0, the default :obj:`__array_priority__` of a subtype is
1.0, and matrices have an :obj:`__array_priority__` of 10.0.
All ufuncs can also take output arguments. If necessary, output will
be cast to the data-type(s) of the provided output array(s). If a class
with an :obj:`__array__` method is used for the output, results will be
written to the object returned by :obj:`__array__`. Then, if the class
also has an :obj:`__array_prepare__` method, it is called so metadata
may be determined based on the context of the ufunc (the context
consisting of the ufunc itself, the arguments passed to the ufunc, and
the ufunc domain.) The array object returned by
:obj:`__array_prepare__` is passed to the ufunc for computation.
Finally, if the class also has an :obj:`__array_wrap__` method, the returned
:class:`ndarray` result will be passed to that method just before
passing control back to the caller.
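A minimal subclass exercising this machinery (``MyArray`` is a
hypothetical example; the behavior shown is the protocol as described
above for NumPy of this era) could be::

    import numpy as np

    class MyArray(np.ndarray):
        # Subclasses default to __array_priority__ = 1.0; raise it so
        # this class wins when the output type is determined.
        __array_priority__ = 15.0

        def __array_prepare__(self, arr, context=None):
            # Called before the ufunc loop runs; metadata could be
            # attached to the returned array here.
            return arr.view(MyArray)

        def __array_wrap__(self, arr, context=None):
            # Called with the ndarray result just before control
            # returns to the caller.
            return arr.view(MyArray)

    x = np.arange(3).view(MyArray)
    assert isinstance(np.add(x, 1), MyArray)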
Use of internal buffers
=======================
.. index:: buffers
Internally, buffers are used for misaligned data, swapped data, and
data that has to be converted from one data type to another. The size
of internal buffers is settable on a per-thread basis. There can
be up to :math:`2 (n_{\mathrm{inputs}} + n_{\mathrm{outputs}})`
buffers of the specified size created to handle the data from all the
inputs and outputs of a ufunc. The default size of a buffer is
10,000 elements. Whenever buffer-based calculation would be needed,
but all input arrays are smaller than the buffer size, those
misbehaved or incorrectly-typed arrays will be copied before the
calculation proceeds. Adjusting the size of the buffer may therefore
alter the speed at which ufunc calculations of various sorts are
completed. A simple interface for setting this variable is accessible
using the function
.. autosummary::
:toctree: generated/
setbufsize
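For example (a minimal sketch; :func:`setbufsize` returns the
previous size, so the original setting can be restored afterwards):

>>> import numpy as np
>>> old = np.setbufsize(2 * 8192)   # returns the previous buffer size
>>> np.getbufsize()
16384
>>> np.setbufsize(old)              # restore the original setting
16384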
Error handling
==============
.. index:: error handling
Universal functions can trip special floating-point status registers
in your hardware (such as divide-by-zero). If available on your
platform, these registers will be regularly checked during
calculation. Error handling is controlled on a per-thread basis,
and can be configured using the functions
.. autosummary::
:toctree: generated/
seterr
seterrcall
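For instance, a minimal sketch (:func:`seterr` returns the previous
settings as a dictionary so they can be restored later; the exact
wording of the error message may vary):

>>> import numpy as np
>>> old_settings = np.seterr(divide='raise')
>>> np.array([1.0]) / 0.0
Traceback (most recent call last):
  ...
FloatingPointError: divide by zero encountered in divide
>>> ignored = np.seterr(**old_settings)   # restore previous behavior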
.. _ufuncs.casting:
Casting Rules
=============
.. index::
pair: ufunc; casting rules
At the core of every ufunc is a one-dimensional strided loop that
implements the actual function for a specific type combination. When a
ufunc is created, it is given a static list of inner loops and a
corresponding list of type signatures over which the ufunc operates.
The ufunc machinery uses this list to determine which inner loop to
use for a particular case. You can inspect the :attr:`.types
<ufunc.types>` attribute for a particular ufunc to see which type
combinations have a defined inner loop and which output type they
produce (:ref:`character codes <arrays.scalars.character-codes>` are used
in said output for brevity).
Casting must be done on one or more of the inputs whenever the ufunc
does not have a core loop implementation for the input types provided.
If an implementation for the input types cannot be found, then the
algorithm searches for an implementation with a type signature to
which all of the inputs can be cast "safely." The first one it finds
in its internal list of loops is selected and performed, after all
necessary type casting. Recall that internal copies during ufuncs (even
for casting) are limited to the size of an internal buffer (which is user
settable).
.. note::
Universal functions in NumPy are flexible enough to have mixed type
signatures. Thus, for example, a universal function could be defined
that works with floating-point and integer values. See :func:`ldexp`
for an example.
By the above description, the casting rules are essentially
implemented by the question of when a data type can be cast "safely"
to another data type. The answer to this question can be determined in
Python with a function call: :func:`can_cast(fromtype, totype)
<can_cast>`. The Figure below shows the results of this call for
the 21 internally supported types on the author's 32-bit system. You
can generate this table for your system with the code given in the Figure.
.. admonition:: Figure
Code segment showing the "can cast safely" table for a 32-bit system.
>>> def print_table(ntypes):
...     print 'X',
...     for char in ntypes: print char,
...     print
...     for row in ntypes:
...         print row,
...         for col in ntypes:
...             print int(np.can_cast(row, col)),
...         print
>>> print_table(np.typecodes['All'])
X ? b h i l q p B H I L Q P f d g F D G S U V O
? 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
b 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
h 0 0 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
i 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1
l 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1
q 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1
p 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1
B 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
H 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
I 0 0 0 0 0 1 0 0 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1
L 0 0 0 0 0 1 0 0 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1
Q 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 1 1 1 1 1
P 0 0 0 0 0 1 0 0 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1
f 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1
g 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 1 1
F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
U 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
You should note that, while included in the table for completeness,
the 'S', 'U', and 'V' types cannot be operated on by ufuncs. Also,
note that on a 64-bit system the integer types may have different
sizes, resulting in a slightly altered table.
Mixed scalar-array operations use a different set of casting rules
that ensure that a scalar cannot "upcast" an array unless the scalar is
of a fundamentally different kind of data (*i.e.*, under a different
hierarchy in the data-type hierarchy) than the array. This rule
enables you to use scalar constants in your code (which, as Python
types, are interpreted accordingly in ufuncs) without worrying about
whether the precision of the scalar constant will cause upcasting of
your large, lower-precision array.
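For example (a small sketch of the rule; a Python float is of the
same fundamental kind as a float32 array, while an array of doubles
is not a scalar at all):

>>> import numpy as np
>>> arr = np.ones(3, dtype=np.float32)
>>> (arr + 3.14159).dtype            # scalar constant: no upcast
dtype('float32')
>>> (arr + np.arange(3.0)).dtype     # double array: upcast applies
dtype('float64')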
:class:`ufunc`
==============
Optional keyword arguments
--------------------------
All ufuncs take optional keyword arguments. These represent rather
advanced usage and are not typically needed by most NumPy users.
.. index::
pair: ufunc; keyword arguments
*sig*
Either a data-type, a tuple of data-types, or a special signature
string indicating the input and output types of a ufunc. This argument
allows you to provide a specific signature for the 1-d loop to use
in the underlying calculation. If the loop specified does not exist
for the ufunc, then a TypeError is raised. Normally, a suitable loop is
found automatically by comparing the input types with what is
available and searching for a loop with data-types to which all inputs
can be cast safely. This keyword argument lets you bypass that
search and choose a particular loop. A list of available signatures is
provided by the **types** attribute of the ufunc object.
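A brief sketch of forcing a particular loop: here two float32 arrays
are routed through the double-precision loop of :func:`add` (float32
can be cast safely to double, so the ``'dd->d'`` signature is
accepted):

>>> import numpy as np
>>> x = np.array([1, 2], dtype=np.float32)
>>> np.add(x, x, sig='dd->d').dtype
dtype('float64')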
*extobj*
a list of length 1, 2, or 3 specifying the ufunc buffer-size, the
error mode integer, and the error call-back function. Normally, these
values are looked up in a thread-specific dictionary. Passing them
here circumvents that look up and uses the low-level specification
provided for the error mode. This may be useful, for example, as an
optimization for calculations requiring many ufunc calls on small arrays
in a loop.
Attributes
----------
There are some informational attributes that universal functions
possess. None of the attributes can be set.
.. index::
pair: ufunc; attributes
============ =================================================================
**__doc__** A docstring for each ufunc. The first part of the docstring is
dynamically generated from the number of outputs, the name, and
the number of inputs. The second part of the docstring is
provided at creation time and stored with the ufunc.
**__name__** The name of the ufunc.
============ =================================================================
.. autosummary::
:toctree: generated/
ufunc.nin
ufunc.nout
ufunc.nargs
ufunc.ntypes
ufunc.types
ufunc.identity
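For example, for :func:`add` (a quick interactive sketch):

>>> import numpy as np
>>> np.add.nin, np.add.nout, np.add.nargs
(2, 1, 3)
>>> np.add.identity
0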
Methods
-------
All ufuncs have four methods. However, these methods only make sense on
ufuncs that take two input arguments and return one output argument.
Attempting to call these methods on other ufuncs will cause a
:exc:`ValueError`. The reduce-like methods all take an *axis* keyword
and a *dtype* keyword, and the arrays must all have dimension >= 1.
The *axis* keyword specifies the axis of the array over which the reduction
will take place and may be negative, but must be an integer. The
*dtype* keyword allows you to manage a very common problem that arises
when naively using :ref:`{op}.reduce <ufunc.reduce>`. Sometimes you may
have an array of a certain data type and wish to add up all of its
elements, but the result does not fit into the data type of the
array. This commonly happens if you have an array of single-byte
integers. The *dtype* keyword allows you to alter the data type over which
the reduction takes place (and therefore the type of the output). Thus,
you can ensure that the output is a data type with precision large enough
to handle your output. The responsibility of altering the reduce type is
mostly up to you. There is one exception: if no *dtype* is given for a
reduction on the "add" or "multiply" operations, then if the input type is
an integer (or Boolean) data-type and smaller than the size of the
:class:`int_` data type, it will be internally upcast to the :class:`int_`
(or :class:`uint`) data-type.
.. index::
pair: ufunc; methods
.. autosummary::
:toctree: generated/
ufunc.reduce
ufunc.accumulate
ufunc.reduceat
ufunc.outer
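A short sketch of these methods in action, using :func:`add` and
:func:`multiply` (the ``int64`` result assumes a platform where
:class:`int_` is 64 bits):

>>> import numpy as np
>>> np.add.reduce(np.arange(5))          # 0 + 1 + 2 + 3 + 4
10
>>> np.add.accumulate(np.arange(5))      # running sums
array([ 0,  1,  3,  6, 10])
>>> np.multiply.outer([1, 2, 3], [1, 10])
array([[ 1, 10],
       [ 2, 20],
       [ 3, 30]])
>>> np.add.reduce(np.arange(5, dtype=np.int8)).dtype  # upcast to int_
dtype('int64')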
.. warning::
A reduce-like operation on an array with a data-type that has a
range "too small" to handle the result will silently wrap. One
should use `dtype` to increase the size of the data-type over which
reduction takes place.
Available ufuncs
================
There are currently more than 60 universal functions defined in
:mod:`numpy` on one or more types, covering a wide variety of
operations. Some of these ufuncs are called automatically on arrays
when the relevant infix notation is used (*e.g.*, :func:`add(a, b) <add>`
is called internally when ``a + b`` is written and *a* or *b* is an
:class:`ndarray`). Nevertheless, you may still want to use the ufunc
call in order to use the optional output argument(s) to place the
output(s) in an object (or objects) of your choice.
Recall that each ufunc operates element-by-element. Therefore, each
ufunc will be described as if acting on a set of scalar inputs to
return a set of scalar outputs.
.. note::
The ufunc still returns its output(s) even if you use the optional
output argument(s).
Math operations
---------------
.. autosummary::
add
subtract
multiply
divide
logaddexp
logaddexp2
true_divide
floor_divide
negative
power
remainder
mod
fmod
absolute
rint
sign
conj
exp
exp2
log
log2
log10
expm1
log1p
sqrt
square
reciprocal
ones_like
.. tip::
The optional output arguments can be used to help you save memory
for large calculations. If your arrays are large, complicated
expressions can take longer than absolutely necessary due to the
creation and (later) destruction of temporary calculation
spaces. For example, the expression ``G = A * B + C`` is equivalent to
``T1 = A * B; G = T1 + C; del T1``. It will be executed more quickly
as ``G = A * B; add(G, C, G)``, which is the same as
``G = A * B; G += C``.
Trigonometric functions
-----------------------
All trigonometric functions use radians when an angle is called for.
The ratio of degrees to radians is :math:`180^{\circ}/\pi.`
.. autosummary::
sin
cos
tan
arcsin
arccos
arctan
arctan2
hypot
sinh
cosh
tanh
arcsinh
arccosh
arctanh
deg2rad
rad2deg
Bit-twiddling functions
-----------------------
These functions all require integer arguments and manipulate the
bit pattern of those arguments.
.. autosummary::
bitwise_and
bitwise_or
bitwise_xor
invert
left_shift
right_shift
Comparison functions
--------------------
.. autosummary::
greater
greater_equal
less
less_equal
not_equal
equal
.. warning::
Do not use the Python keywords ``and`` and ``or`` to combine
logical array expressions. These keywords will test the truth
value of the entire array (not element-by-element as you might
expect). Use the bitwise operators & and \| instead.
.. autosummary::
logical_and
logical_or
logical_xor
logical_not
.. warning::
The bit-wise operators & and \| are the proper way to perform
element-by-element array comparisons. Be sure you understand the
operator precedence: ``(a > 2) & (a < 5)`` is the proper syntax because
``a > 2 & a < 5`` will result in an error due to the fact that ``2 & a``
is evaluated first.
.. autosummary::
maximum
.. tip::
The Python function ``max()`` will find the maximum over a one-dimensional
array, but it will do so using a slower sequence interface. The reduce
method of the maximum ufunc is much faster. Also, the ``max()`` function
will not give the answers you might expect for arrays with more than
one dimension. The reduce method of minimum also allows you to compute
a total minimum over an array.
.. autosummary::
minimum
.. warning::
The behavior of ``maximum(a, b)`` is different from that of ``max(a, b)``.
As a ufunc, ``maximum(a, b)`` performs an element-by-element comparison
of `a` and `b` and chooses each element of the result according to which
element in the two arrays is larger. In contrast, ``max(a, b)`` treats
the objects `a` and `b` as a whole, looks at the (total) truth value of
``a > b`` and uses it to return either `a` or `b` (as a whole). A similar
difference exists between ``minimum(a, b)`` and ``min(a, b)``.
Floating functions
------------------
Recall that all of these functions work element-by-element over an
array, returning an array output. The description details only a
single operation.
.. autosummary::
isreal
iscomplex
isfinite
isinf
isnan
signbit
copysign
nextafter
modf
ldexp
frexp
fmod
floor
ceil
trunc

View file

@ -1,5 +0,0 @@
*************
Release Notes
*************
.. include:: ../release/1.3.0-notes.rst


View file

@ -1,7 +0,0 @@
************
Broadcasting
************
.. seealso:: :class:`numpy.broadcast`
.. automodule:: numpy.doc.broadcasting

View file

@ -1,5 +0,0 @@
*************
Byte-swapping
*************
.. automodule:: numpy.doc.byteswapping

View file

@ -1,9 +0,0 @@
.. _arrays.creation:
**************
Array creation
**************
.. seealso:: :ref:`Array creation routines <routines.array-creation>`
.. automodule:: numpy.doc.creation

View file

@ -1,9 +0,0 @@
.. _basics.indexing:
********
Indexing
********
.. seealso:: :ref:`Indexing routines <routines.indexing>`
.. automodule:: numpy.doc.indexing

View file

@ -1,444 +0,0 @@
.. sectionauthor:: Pierre Gerard-Marchant <pierregmcode@gmail.com>
*********************************************
Importing data with :func:`~numpy.genfromtxt`
*********************************************
Numpy provides several functions to create arrays from tabular data.
We focus here on the :func:`~numpy.genfromtxt` function.
In a nutshell, :func:`~numpy.genfromtxt` runs two main loops.
The first loop converts each line of the file into a sequence of strings.
The second loop converts each string to the appropriate data type.
This mechanism is slower than a single loop, but gives more flexibility.
In particular, :func:`~numpy.genfromtxt` is able to take missing data into account, whereas other faster and simpler functions like :func:`~numpy.loadtxt` cannot.
.. note::
When giving examples, we will use the following conventions
>>> import numpy as np
>>> from StringIO import StringIO
Defining the input
==================
The only mandatory argument of :func:`~numpy.genfromtxt` is the source of the data.
It can be a string corresponding to the name of a local or remote file, or a file-like object with a :meth:`read` method (such as an actual file or a :class:`StringIO.StringIO` object).
If the argument is the URL of a remote file, the latter is automatically downloaded to the current directory.
The input file can be a text file or an archive.
Currently, the function recognizes :class:`gzip` and :class:`bz2` (`bzip2`) archives.
The type of the archive is determined by examining the extension of the file:
if the filename ends with ``'.gz'``, a :class:`gzip` archive is expected; if it ends with ``'.bz2'``, a :class:`bzip2` archive is assumed.
Splitting the lines into columns
================================
The :keyword:`delimiter` argument
---------------------------------
Once the file is defined and open for reading, :func:`~numpy.genfromtxt` splits each non-empty line into a sequence of strings.
Empty or commented lines are just skipped.
The :keyword:`delimiter` keyword is used to define how the splitting should take place.
Quite often, a single character marks the separation between columns.
For example, comma-separated files (CSV) use a comma (``,``) or a semicolon (``;``) as delimiter.
>>> data = "1, 2, 3\n4, 5, 6"
>>> np.genfromtxt(StringIO(data), delimiter=",")
array([[ 1., 2., 3.],
[ 4., 5., 6.]])
Another common separator is ``"\t"``, the tabulation character.
However, we are not limited to a single character; any string will do.
By default, :func:`~numpy.genfromtxt` assumes ``delimiter=None``, meaning that the line is split along white spaces (including tabs) and that consecutive white spaces are considered as a single white space.
Alternatively, we may be dealing with a fixed-width file, where columns are defined as a given number of characters.
In that case, we need to set :keyword:`delimiter` to a single integer (if all the columns have the same size) or to a sequence of integers (if columns can have different sizes).
>>> data = " 1 2 3\n 4 5 67\n890123 4"
>>> np.genfromtxt(StringIO(data), delimiter=3)
array([[ 1., 2., 3.],
[ 4., 5., 67.],
[ 890., 123., 4.]])
>>> data = "123456789\n 4 7 9\n 4567 9"
>>> np.genfromtxt(StringIO(data), delimiter=(4, 3, 2))
array([[ 1234., 567., 89.],
[ 4., 7., 9.],
[ 4., 567., 9.]])
The :keyword:`autostrip` argument
---------------------------------
By default, when a line is decomposed into a series of strings, the individual entries are not stripped of leading or trailing white spaces.
This behavior can be overridden by setting the optional argument :keyword:`autostrip` to a value of ``True``.
>>> data = "1, abc , 2\n 3, xxx, 4"
>>> # Without autostrip
>>> np.genfromtxt(StringIO(data), dtype="|S5")
array([['1', ' abc ', ' 2'],
['3', ' xxx', ' 4']],
dtype='|S5')
>>> # With autostrip
>>> np.genfromtxt(StringIO(data), dtype="|S5", autostrip=True)
array([['1', 'abc', '2'],
['3', 'xxx', '4']],
dtype='|S5')
The :keyword:`comments` argument
--------------------------------
The optional argument :keyword:`comments` is used to define a character string that marks the beginning of a comment.
By default, :func:`~numpy.genfromtxt` assumes ``comments='#'``.
The comment marker may occur anywhere on the line.
Any character present after the comment marker(s) is simply ignored.
>>> data = """#
... # Skip me !
... # Skip me too !
... 1, 2
... 3, 4
... 5, 6 #This is the third line of the data
... 7, 8
... # And here comes the last line
... 9, 0
... """
>>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9.,  0.]])
.. note::
There is one notable exception to this behavior: if the optional argument ``names=True``, the first commented line will be examined for names.
Skipping lines and choosing columns
===================================
The :keyword:`skip_header` and :keyword:`skip_footer` arguments
---------------------------------------------------------------
The presence of a header in the file can hinder data processing.
In that case, we need to use the :keyword:`skip_header` optional argument.
The value of this argument must be an integer which corresponds to the number of lines to skip at the beginning of the file, before any other action is performed.
Similarly, we can skip the last ``n`` lines of the file by using the :keyword:`skip_footer` attribute and giving it a value of ``n``.
>>> data = "\n".join(str(i) for i in range(10))
>>> np.genfromtxt(StringIO(data),)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> np.genfromtxt(StringIO(data),
... skip_header=3, skip_footer=5)
array([ 3., 4.])
By default, ``skip_header=0`` and ``skip_footer=0``, meaning that no lines are skipped.
The :keyword:`usecols` argument
-------------------------------
In some cases, we are not interested in all the columns of the data but only a few of them.
We can select which columns to import with the :keyword:`usecols` argument.
This argument accepts a single integer or a sequence of integers corresponding to the indices of the columns to import.
Remember that by convention, the first column has an index of 0.
Negative integers correspond to counting from the end, as with regular Python indexing.
For example, if we want to import only the first and the last columns, we can use ``usecols=(0, -1)``:
>>> data = "1 2 3\n4 5 6"
>>> np.genfromtxt(StringIO(data), usecols=(0, -1))
array([[ 1., 3.],
[ 4., 6.]])
If the columns have names, we can also select which columns to import by giving their name to the :keyword:`usecols` argument, either as a sequence of strings or a comma-separated string.
>>> data = "1 2 3\n4 5 6"
>>> np.genfromtxt(StringIO(data),
... names="a, b, c", usecols=("a", "c"))
array([(1.0, 3.0), (4.0, 6.0)],
dtype=[('a', '<f8'), ('c', '<f8')])
>>> np.genfromtxt(StringIO(data),
... names="a, b, c", usecols=("a, c"))
array([(1.0, 3.0), (4.0, 6.0)],
dtype=[('a', '<f8'), ('c', '<f8')])
Choosing the data type
======================
The main way to control how the sequences of strings we have read from the file are converted to other types is to set the :keyword:`dtype` argument.
Acceptable values for this argument are:
* a single type, such as ``dtype=float``.
The output will be 2D with the given dtype, unless a name has been associated with each column with the use of the :keyword:`names` argument (see below).
Note that ``dtype=float`` is the default for :func:`~numpy.genfromtxt`.
* a sequence of types, such as ``dtype=(int, float, float)``.
* a comma-separated string, such as ``dtype="i4,f8,|S3"``.
* a dictionary with two keys ``'names'`` and ``'formats'``.
* a sequence of tuples ``(name, type)``, such as ``dtype=[('A', int), ('B', float)]``.
* an existing :class:`numpy.dtype` object.
* the special value ``None``.
In that case, the type of the columns will be determined from the data itself (see below).
In all the cases but the first one, the output will be a 1D array with a structured dtype.
This dtype has as many fields as items in the sequence.
The field names are defined with the :keyword:`names` keyword.
When ``dtype=None``, the type of each column is determined iteratively from its data.
We start by checking whether a string can be converted to a boolean (that is, whether the string matches ``true`` or ``false`` in lower case);
then whether it can be converted to an integer, then to a float, then to a complex and eventually to a string.
This behavior may be changed by modifying the default mapper of the :class:`~numpy.lib._iotools.StringConverter` class.
The option ``dtype=None`` is provided for convenience.
However, it is significantly slower than setting the dtype explicitly.
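For example, with ``dtype=None`` the integer and float columns below
are detected automatically (a sketch; the integer width shown here,
``'<i8'``, depends on the platform):

>>> data = StringIO("1 2 2.5\n4 5 6.5")
>>> np.genfromtxt(data, dtype=None)
array([(1, 2, 2.5), (4, 5, 6.5)],
      dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<f8')])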
Setting the names
=================
The :keyword:`names` argument
-----------------------------
A natural approach when dealing with tabular data is to allocate a name to each column.
A first possibility is to use an explicit structured dtype, as mentioned previously.
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=[(_, int) for _ in "abc"])
array([(1, 2, 3), (4, 5, 6)],
dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])
Another simpler possibility is to use the :keyword:`names` keyword with a sequence of strings or a comma-separated string.
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, names="A, B, C")
array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
In the example above, we used the fact that by default, ``dtype=float``.
By giving a sequence of names, we are forcing the output to a structured dtype.
We may sometimes need to define the column names from the data itself.
In that case, we must use the :keyword:`names` keyword with a value of ``True``.
The names will then be read from the first line (after the ``skip_header`` ones), even if the line is commented out.
>>> data = StringIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
>>> np.genfromtxt(data, skip_header=1, names=True)
array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
The default value of :keyword:`names` is ``None``.
If we give any other value to the keyword, the new names will overwrite the field names we may have defined with the dtype.
>>> data = StringIO("1 2 3\n 4 5 6")
>>> ndtype=[('a',int), ('b', float), ('c', int)]
>>> names = ["A", "B", "C"]
>>> np.genfromtxt(data, names=names, dtype=ndtype)
array([(1, 2.0, 3), (4, 5.0, 6)],
dtype=[('A', '<i8'), ('B', '<f8'), ('C', '<i8')])
The :keyword:`defaultfmt` argument
----------------------------------
If ``names=None`` but a structured dtype is expected, names are defined with the standard NumPy default of ``"f%i"``, yielding names like ``f0``, ``f1`` and so forth.
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int))
array([(1, 2.0, 3), (4, 5.0, 6)],
dtype=[('f0', '<i8'), ('f1', '<f8'), ('f2', '<i8')])
In the same way, if we don't give enough names to match the length of the dtype, the missing names will be defined with this default template.
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int), names="a")
array([(1, 2.0, 3), (4, 5.0, 6)],
dtype=[('a', '<i8'), ('f0', '<f8'), ('f1', '<i8')])
We can overwrite this default with the :keyword:`defaultfmt` argument, that takes any format string:
>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int), defaultfmt="var_%02i")
array([(1, 2.0, 3), (4, 5.0, 6)],
dtype=[('var_00', '<i8'), ('var_01', '<f8'), ('var_02', '<i8')])
.. note::
We need to keep in mind that ``defaultfmt`` is used only if some names are expected but not defined.
Validating names
----------------
Numpy arrays with a structured dtype can also be viewed as :class:`~numpy.recarray`, where a field can be accessed as if it were an attribute.
For that reason, we may need to make sure that the field name doesn't contain any space or invalid character, or that it does not correspond to the name of a standard attribute (like ``size`` or ``shape``), which would confuse the interpreter.
:func:`~numpy.genfromtxt` accepts three optional arguments that provide a finer control on the names:
:keyword:`deletechars`
Gives a string combining all the characters that must be deleted from the name. By default, invalid characters are ``~!@#$%^&*()-=+~\|]}[{';: /?.>,<``.
:keyword:`excludelist`
Gives a list of the names to exclude, such as ``return``, ``file``, ``print``...
If one of the input names is part of this list, an underscore character (``'_'``) will be appended to it.
:keyword:`case_sensitive`
Whether the names should be case-sensitive (``case_sensitive=True``),
converted to upper case (``case_sensitive=False`` or ``case_sensitive='upper'``) or to lower case (``case_sensitive='lower'``).
Tweaking the conversion
=======================
The :keyword:`converters` argument
----------------------------------
Usually, defining a dtype is sufficient to define how the sequence of strings must be converted.
However, some additional control may sometimes be required.
For example, we may want to make sure that a date in a format ``YYYY/MM/DD`` is converted to a :class:`datetime` object, or that a string like ``xx%`` is properly converted to a float between 0 and 1.
In such cases, we should define conversion functions with the :keyword:`converters` arguments.
The value of this argument is typically a dictionary with column indices or column names as keys and a conversion function as values.
These conversion functions can either be actual functions or lambda functions. In any case, they should accept only a string as input and output only a single element of the wanted type.
In the following example, the second column is converted from a string representing a percentage to a float between 0 and 1.
>>> convertfunc = lambda x: float(x.strip("%"))/100.
>>> data = "1, 2.3%, 45.\n6, 78.9%, 0"
>>> names = ("i", "p", "n")
>>> # General case .....
>>> np.genfromtxt(StringIO(data), delimiter=",", names=names)
array([(1.0, nan, 45.0), (6.0, nan, 0.0)],
dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
We need to keep in mind that by default, ``dtype=float``.
A float is therefore expected for the second column.
However, the strings ``' 2.3%'`` and ``' 78.9%'`` cannot be converted to float and we end up having ``np.nan`` instead.
Let's now use a converter.
>>> # Converted case ...
>>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
... converters={1: convertfunc})
array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
The same results can be obtained by using the name of the second column (``"p"``) as key instead of its index (1).
>>> # Using a name for the converter ...
>>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
... converters={"p": convertfunc})
array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
Converters can also be used to provide a default for missing entries.
In the following example, the converter ``convert`` transforms a stripped string into the corresponding float or into -999 if the string is empty.
We need to explicitly strip the string from white spaces as it is not done by default.
>>> data = "1, , 3\n 4, 5, 6"
>>> convert = lambda x: float(x.strip() or -999)
>>> np.genfromtxt(StringIO(data), delimiter=",",
... converters={1: convert})
array([[ 1., -999., 3.],
[ 4., 5., 6.]])
Using missing and filling values
--------------------------------
Some entries may be missing in the dataset we are trying to import.
In a previous example, we used a converter to transform an empty string into a float.
However, user-defined converters may rapidly become cumbersome to manage.
The :func:`~numpy.genfromtxt` function provides two other complementary mechanisms: the :keyword:`missing_values` argument is used to recognize missing data and a second argument, :keyword:`filling_values`, is used to process these missing data.
:keyword:`missing_values`
-------------------------
By default, any empty string is marked as missing.
We can also consider more complex strings, such as ``"N/A"`` or ``"???"`` to represent missing or invalid data.
The :keyword:`missing_values` argument accepts three kinds of values:
a string or a comma-separated string
This string will be used as the marker for missing data for all the columns.
a sequence of strings
In that case, each item is associated with a column, in order.
a dictionary
Values of the dictionary are strings or sequences of strings.
The corresponding keys can be column indices (integers) or column names (strings). In addition, the special key ``None`` can be used to define a default applicable to all columns.
:keyword:`filling_values`
-------------------------
We know how to recognize missing data, but we still need to provide a value for these missing entries.
By default, this value is determined from the expected dtype according to this table:
============= ==============
Expected type Default
============= ==============
``bool`` ``False``
``int`` ``-1``
``float`` ``np.nan``
``complex`` ``np.nan+0j``
``string`` ``'???'``
============= ==============
We can get a finer control on the conversion of missing values with the :keyword:`filling_values` optional argument.
Like :keyword:`missing_values`, this argument accepts different kinds of values:
a single value
This will be the default for all columns.
a sequence of values
Each entry will be the default for the corresponding column.
a dictionary
Each key can be a column index or a column name, and the corresponding value should be a single object.
We can use the special key ``None`` to define a default for all columns.
In the following example, we suppose that the missing values are flagged with ``"N/A"`` in the first column and by ``"???"`` in the third column.
We wish to transform these missing values to 0 if they occur in the first and second column, and to -999 if they occur in the last column.
>>> data = "N/A, 2, 3\n4, ,???"
>>> kwargs = dict(delimiter=",",
... dtype=int,
... names="a,b,c",
... missing_values={0:"N/A", 'b':" ", 2:"???"},
... filling_values={0:0, 'b':0, 2:-999})
>>> np.genfromtxt(StringIO(data), **kwargs)
array([(0, 2, 3), (4, 0, -999)],
dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])
:keyword:`usemask`
------------------
We may also want to keep track of the occurrence of missing data by constructing a boolean mask, with ``True`` entries where data was missing and ``False`` otherwise.
To do that, we just have to set the optional argument :keyword:`usemask` to ``True`` (the default is ``False``).
The output array will then be a :class:`~numpy.ma.MaskedArray`.
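A small sketch (only the mask is shown here, as the full masked-array
repr varies between versions):

>>> data = "1, , 3\n 4, 5, 6"
>>> np.genfromtxt(StringIO(data), delimiter=",", usemask=True).mask
array([[False,  True, False],
       [False, False, False]], dtype=bool)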
.. unpack=None, loose=True, invalid_raise=True)
Shortcut functions
==================
In addition to :func:`~numpy.genfromtxt`, the :mod:`numpy.lib.io` module provides several convenience functions derived from :func:`~numpy.genfromtxt`.
These functions work the same way as the original, but they have different default values.
:func:`~numpy.ndfromtxt`
Always set ``usemask=False``.
The output is always a standard :class:`numpy.ndarray`.
:func:`~numpy.mafromtxt`
Always set ``usemask=True``.
The output is always a :class:`~numpy.ma.MaskedArray`.
:func:`~numpy.recfromtxt`
Returns a standard :class:`numpy.recarray` (if ``usemask=False``) or a :class:`~numpy.ma.MaskedRecords` array (if ``usemask=True``).
The default dtype is ``dtype=None``, meaning that the types of each column will be automatically determined.
:func:`~numpy.recfromcsv`
Like :func:`~numpy.recfromtxt`, but with a default ``delimiter=","``.

View file

@ -1,8 +0,0 @@
**************
I/O with Numpy
**************
.. toctree::
:maxdepth: 2
basics.io.genfromtxt

View file

@ -1,7 +0,0 @@
.. _structured_arrays:
***************************************
Structured arrays (aka "Record arrays")
***************************************
.. automodule:: numpy.doc.structured_arrays

View file

@ -1,15 +0,0 @@
************
Numpy basics
************
.. toctree::
:maxdepth: 2
basics.types
basics.creation
basics.io
basics.indexing
basics.broadcasting
basics.byteswapping
basics.rec
basics.subclassing

View file

@ -1,7 +0,0 @@
.. _basics.subclassing:
*******************
Subclassing ndarray
*******************
.. automodule:: numpy.doc.subclassing

View file

@ -1,7 +0,0 @@
**********
Data types
**********
.. seealso:: :ref:`Data type objects <arrays.dtypes>`
.. automodule:: numpy.doc.basics

View file

@ -1,740 +0,0 @@
*****************
Beyond the Basics
*****************
| The voyage of discovery is not in seeking new landscapes but in having
| new eyes.
| --- *Marcel Proust*
| Discovery is seeing what everyone else has seen and thinking what no
| one else has thought.
| --- *Albert Szent-Gyorgi*
Iterating over elements in the array
====================================
.. _`sec:array_iterator`:
Basic Iteration
---------------
One common algorithmic requirement is to be able to walk over all
elements in a multidimensional array. The array iterator object makes
this easy to do in a generic way that works for arrays of any
dimension. Naturally, if you know the number of dimensions you will be
using, then you can always write nested for loops to accomplish the
iteration. If, however, you want to write code that works with any
number of dimensions, then you can make use of the array iterator. An
array iterator object is returned when accessing the .flat attribute
of an array.
.. index::
single: array iterator
Basic usage is to call :cfunc:`PyArray_IterNew` ( ``array`` ) where array
is an ndarray object (or one of its sub-classes). The returned object
is an array-iterator object (the same object returned by the .flat
attribute of the ndarray). This object is usually cast to
PyArrayIterObject* so that its members can be accessed. The only
members that are needed are ``iter->size`` which contains the total
size of the array, ``iter->index``, which contains the current 1-d
index into the array, and ``iter->dataptr`` which is a pointer to the
data for the current element of the array. Sometimes it is also
useful to access ``iter->ao`` which is a pointer to the underlying
ndarray object.
After processing data at the current element of the array, the next
element of the array can be obtained using the macro
:cfunc:`PyArray_ITER_NEXT` ( ``iter`` ). The iteration always proceeds in a
C-style contiguous fashion (last index varying the fastest). The
:cfunc:`PyArray_ITER_GOTO` ( ``iter``, ``destination`` ) can be used to
jump to a particular point in the array, where ``destination`` is an
array of npy_intp data-type with space to handle at least the number
of dimensions in the underlying array. Occasionally it is useful to
use :cfunc:`PyArray_ITER_GOTO1D` ( ``iter``, ``index`` ) which will jump
to the 1-d index given by the value of ``index``. The most common
usage, however, is given in the following example.
.. code-block:: c
PyObject *obj; /* assumed to be some ndarray object */
PyArrayIterObject *iter;
...
iter = (PyArrayIterObject *)PyArray_IterNew(obj);
if (iter == NULL) goto fail; /* Assume fail has clean-up code */
while (iter->index < iter->size) {
/* do something with the data at iter->dataptr */
PyArray_ITER_NEXT(iter);
}
...
You can also use :cfunc:`PyArrayIter_Check` ( ``obj`` ) to ensure you have
an iterator object and :cfunc:`PyArray_ITER_RESET` ( ``iter`` ) to reset an
iterator object back to the beginning of the array.
It should be emphasized at this point that you may not need the array
iterator if your array is already contiguous (using an array iterator
will work but will be slower than the fastest code you could write).
The major purpose of array iterators is to encapsulate iteration over
N-dimensional arrays with arbitrary strides. They are used in many,
many places in the NumPy source code itself. If you already know your
array is contiguous (Fortran or C), then simply adding the element-
size to a running pointer variable will step you through the array
very efficiently. In other words, code like this will probably be
faster for you in the contiguous case (assuming doubles).
.. code-block:: c
npy_intp size;
double *dptr; /* could make this any variable type */
size = PyArray_SIZE(obj);
dptr = PyArray_DATA(obj);
while(size--) {
/* do something with the data at dptr */
dptr++;
}
Iterating over all but one axis
-------------------------------
A common algorithm is to loop over all elements of an array and
perform some function with each element by issuing a function call. As
function calls can be time consuming, one way to speed up this kind of
algorithm is to write the function so it takes a vector of data and
then write the iteration so the function call is performed for an
entire dimension of data at a time. This increases the amount of work
done per function call, thereby reducing the function-call over-head
to a small(er) fraction of the total time. Even if the interior of the
loop is performed without a function call it can be advantageous to
perform the inner loop over the dimension with the highest number of
elements to take advantage of speed enhancements available on
microprocessors that use pipelining to enhance fundamental operations.
The :cfunc:`PyArray_IterAllButAxis` ( ``array``, ``&dim`` ) constructs an
iterator object that is modified so that it will not iterate over the
dimension indicated by dim. The only restriction on this iterator
object is that the :cfunc:`PyArray_ITER_GOTO1D` ( ``it``, ``ind`` ) macro
cannot be used (thus flat indexing won't work either if you pass this
object back to Python --- so you shouldn't do this). Note that the
returned object from this routine is still usually cast to
PyArrayIterObject \*. All that's been done is to modify the strides
and dimensions of the returned iterator to simulate iterating over
array[...,0,...] where 0 is placed on the
:math:`\textrm{dim}^{\textrm{th}}` dimension. If dim is negative, then
the dimension with the largest axis is found and used.
Iterating over multiple arrays
------------------------------
Very often, it is desirable to iterate over several arrays at the
same time. The universal functions are an example of this kind of
behavior. If all you want to do is iterate over arrays with the same
shape, then simply creating several iterator objects is the standard
procedure. For example, the following code iterates over two arrays
assumed to be the same shape and size (actually obj1 just has to have
at least as many total elements as does obj2):
.. code-block:: c
/* It is already assumed that obj1 and obj2
are ndarrays of the same shape and size.
*/
iter1 = (PyArrayIterObject *)PyArray_IterNew(obj1);
if (iter1 == NULL) goto fail;
iter2 = (PyArrayIterObject *)PyArray_IterNew(obj2);
if (iter2 == NULL) goto fail; /* assume iter1 is DECREF'd at fail */
while (iter2->index < iter2->size) {
/* process with iter1->dataptr and iter2->dataptr */
PyArray_ITER_NEXT(iter1);
PyArray_ITER_NEXT(iter2);
}
Broadcasting over multiple arrays
---------------------------------
.. index::
single: broadcasting
When multiple arrays are involved in an operation, you may want to use the
same broadcasting rules that the math operations (*i.e.* the ufuncs) use.
This can be done easily using the :ctype:`PyArrayMultiIterObject`. This is
the object returned from the Python command numpy.broadcast and it is almost
as easy to use from C. The function
:cfunc:`PyArray_MultiIterNew` ( ``n``, ``...`` ) is used (with ``n`` input
objects in place of ``...`` ). The input objects can be arrays or anything
that can be converted into an array. A pointer to a PyArrayMultiIterObject is
returned. Broadcasting has already been accomplished which adjusts the
iterators so that all that needs to be done to advance to the next element in
each array is for PyArray_ITER_NEXT to be called for each of the inputs. This
incrementing is automatically performed by the
:cfunc:`PyArray_MultiIter_NEXT` ( ``obj`` ) macro (which can handle a
multiterator ``obj`` as either a :ctype:`PyArrayMultiObject *` or a
:ctype:`PyObject *`). The data from input number ``i`` is available using
:cfunc:`PyArray_MultiIter_DATA` ( ``obj``, ``i`` ) and the total (broadcasted)
size as :cfunc:`PyArray_MultiIter_SIZE` ( ``obj``). An example of using this
feature follows.
.. code-block:: c
mobj = PyArray_MultiIterNew(2, obj1, obj2);
size = PyArray_MultiIter_SIZE(mobj);
while(size--) {
ptr1 = PyArray_MultiIter_DATA(mobj, 0);
ptr2 = PyArray_MultiIter_DATA(mobj, 1);
/* code using contents of ptr1 and ptr2 */
PyArray_MultiIter_NEXT(mobj);
}
The function :cfunc:`PyArray_RemoveSmallest` ( ``multi`` ) can be used to
take a multi-iterator object and adjust all the iterators so that
iteration does not take place over the largest dimension (it makes
that dimension of size 1). The code being looped over that makes use
of the pointers will very likely also need the strides data for each
of the iterators. This information is stored in
multi->iters[i]->strides.
.. index::
single: array iterator
There are several examples of using the multi-iterator in the NumPy
source code as it makes N-dimensional broadcasting-code very simple to
write. Browse the source for more examples.
.. _`sec:Creating-a-new`:
Creating a new universal function
=================================
.. index::
pair: ufunc; adding new
The umath module is a computer-generated C-module that creates many
ufuncs. It provides a great many examples of how to create a universal
function. Creating your own ufunc that will make use of the ufunc
machinery is not difficult either. Suppose you have a function that
you want to operate element-by-element over its inputs. By creating a
new ufunc you will obtain a function that handles
- broadcasting
- N-dimensional looping
- automatic type-conversions with minimal memory usage
- optional output arrays
It is not difficult to create your own ufunc. All that is required is
a 1-d loop for each data-type you want to support. Each 1-d loop must
have a specific signature, and only ufuncs for fixed-size data-types
can be used. The function call used to create a new ufunc to work on
built-in data-types is given below. A different mechanism is used to
register ufuncs for user-defined data-types.
.. cfunction:: PyObject *PyUFunc_FromFuncAndData( PyUFuncGenericFunction* func,
void** data, char* types, int ntypes, int nin, int nout, int identity,
char* name, char* doc, int check_return)
*func*
A pointer to an array of 1-d functions to use. This array must be at
least ntypes long. Each entry in the array must be a
``PyUFuncGenericFunction`` function. This function has the following
signature. An example of a valid 1d loop function is also given.
.. cfunction:: void loop1d(char** args, npy_intp* dimensions,
npy_intp* steps, void* data)
*args*
An array of pointers to the actual data for the input and output
arrays. The input arguments are given first followed by the output
arguments.
*dimensions*
A pointer to the size of the dimension over which this function is
looping.
*steps*
A pointer to the number of bytes to jump to get to the
next element in this dimension for each of the input and
output arguments.
*data*
Arbitrary data (extra arguments, function names, *etc.* )
that can be stored with the ufunc and will be passed in
when it is called.
.. code-block:: c
static void
double_add(char **args, npy_intp *dimensions, npy_intp *steps,
           void *extra)
{
npy_intp i;
npy_intp is1=steps[0], is2=steps[1];
npy_intp os=steps[2], n=dimensions[0];
char *i1=args[0], *i2=args[1], *op=args[2];
for (i=0; i<n; i++) {
*((double *)op) = *((double *)i1) + \
*((double *)i2);
i1 += is1; i2 += is2; op += os;
}
}
*data*
An array of data. There should be ntypes entries (or NULL) --- one for
every loop function defined for this ufunc. This data will be passed
in to the 1-d loop. One common use of this data variable is to pass in
an actual function to call to compute the result when a generic 1-d
loop (e.g. :cfunc:`PyUFunc_d_d`) is being used.
*types*
An array of type-number signatures (type ``char`` ). This
array should be of size (nin+nout)*ntypes and contain the
data-types for the corresponding 1-d loop. The inputs should
be first followed by the outputs. For example, suppose I have
a ufunc that supports 1 integer and 1 double 1-d loop
(length-2 func and data arrays) that takes 2 inputs and
returns 1 output that is always a complex double, then the
types array would be
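.. code-block:: c

    /* A sketch consistent with the description above: two loops
       (one int, one double), each listing its two inputs followed
       by the complex-double output. */
    static char types[6] = {NPY_INT, NPY_INT, NPY_CDOUBLE,
                            NPY_DOUBLE, NPY_DOUBLE, NPY_CDOUBLE};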
The bit-width names can also be used (e.g. :cdata:`NPY_INT32`,
:cdata:`NPY_COMPLEX128` ) if desired.
*ntypes*
The number of data-types supported. This is equal to the number of 1-d
loops provided.
*nin*
The number of input arguments.
*nout*
The number of output arguments.
*identity*
Either :cdata:`PyUFunc_One`, :cdata:`PyUFunc_Zero`,
:cdata:`PyUFunc_None`. This specifies what should be returned when
an empty array is passed to the reduce method of the ufunc.
*name*
A ``NULL`` -terminated string providing the name of this ufunc
(should be the Python name it will be called).
*doc*
A documentation string for this ufunc (will be used in generating the
response to ``{ufunc_name}.__doc__``). Do not include the function
signature or the name as this is generated automatically.
*check_return*
Not presently used, but this integer value does get set in the
structure-member of similar name.
.. index::
pair: ufunc; adding new
The returned ufunc object is a callable Python object. It should be
placed in a (module) dictionary under the same name as was used in the
name argument to the ufunc-creation routine. The following example is
adapted from the umath module
.. code-block:: c
static PyUFuncGenericFunction atan2_functions[]=\
{PyUFunc_ff_f, PyUFunc_dd_d,
PyUFunc_gg_g, PyUFunc_OO_O_method};
static void* atan2_data[]=\
{(void *)atan2f,(void *) atan2,
(void *)atan2l,(void *)"arctan2"};
static char atan2_signatures[]=\
{NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
 NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
 NPY_LONGDOUBLE, NPY_LONGDOUBLE, NPY_LONGDOUBLE,
 NPY_OBJECT, NPY_OBJECT, NPY_OBJECT};
...
/* in the module initialization code */
PyObject *f, *dict, *module;
...
dict = PyModule_GetDict(module);
...
f = PyUFunc_FromFuncAndData(atan2_functions,
atan2_data, atan2_signatures, 4, 2, 1,
PyUFunc_None, "arctan2",
"a safe and correct arctan(x1/x2)", 0);
PyDict_SetItemString(dict, "arctan2", f);
Py_DECREF(f);
...
.. _user.user-defined-data-types:
User-defined data-types
=======================
NumPy comes with 21 builtin data-types. While this covers a large
majority of possible use cases, it is conceivable that a user may have
a need for an additional data-type. There is some support for adding
an additional data-type into the NumPy system. This additional data-
type will behave much like a regular data-type except ufuncs must have
1-d loops registered to handle it separately. Also, checking whether
other data-types can be cast "safely" to and from this new type will
always return "can cast" unless you also register which types your new
data-type can be cast to and from. Adding
data-types is one of the less well-tested areas for NumPy 1.0, so
there may be bugs remaining in the approach. Only add a new data-type
if you can't do what you want to do using the OBJECT or VOID
data-types that are already available. An example of what I
consider a useful application of the ability to add data-types is the
possibility of adding a data-type of arbitrary precision floats to
NumPy.
.. index::
pair: dtype; adding new
Adding the new data-type
------------------------
To begin to make use of the new data-type, you need to first define a
new Python type to hold the scalars of your new data-type. It should
be acceptable to inherit from one of the array scalars if your new
type has a binary compatible layout. This will allow your new data
type to have the methods and attributes of array scalars. New data-
types must have a fixed memory size (if you want to define a data-type
that needs a flexible representation, like a variable-precision
number, then use a pointer to the object as the data-type). The memory
layout of the object structure for the new Python type must be
PyObject_HEAD followed by the fixed-size memory needed for the data-
type. For example, a suitable structure for the new Python type is:
.. code-block:: c
typedef struct {
PyObject_HEAD;
some_data_type obval;
/* the name can be whatever you want */
} PySomeDataTypeObject;
After you have defined a new Python type object, you must then define
a new :ctype:`PyArray_Descr` structure whose typeobject member will contain a
pointer to the data-type you've just defined. In addition, the
required functions in the ".f" member must be defined: nonzero,
copyswap, copyswapn, setitem, getitem, and cast. The more functions in
the ".f" member you define, however, the more useful the new data-type
will be. It is very important to initialize unused functions to NULL.
This can be achieved using :cfunc:`PyArray_InitArrFuncs` (f).
Once a new :ctype:`PyArray_Descr` structure is created and filled with the
needed information and useful functions you call
:cfunc:`PyArray_RegisterDataType` (new_descr). The return value from this
call is an integer providing you with a unique type_number that
specifies your data-type. This type number should be stored and made
available by your module so that other modules can use it to recognize
your data-type (the other mechanism for finding a user-defined
data-type number is to search based on the name of the type-object
associated with the data-type using :cfunc:`PyArray_TypeNumFromName` ).
Registering a casting function
------------------------------
You may want to allow builtin (and other user-defined) data-types to
be cast automatically to your data-type. In order to make this
possible, you must register a casting function with the data-type you
want to be able to cast from. This requires writing low-level casting
functions for each conversion you want to support and then registering
these functions with the data-type descriptor. A low-level casting
function has the signature.
.. cfunction:: void castfunc( void* from, void* to, npy_intp n, void* fromarr,
void* toarr)
Cast ``n`` elements ``from`` one type ``to`` another. The data to
cast from is in a contiguous, correctly-swapped and aligned chunk
of memory pointed to by from. The buffer to cast to is also
contiguous, correctly-swapped and aligned. The fromarr and toarr
arguments should only be used for flexible-element-sized arrays
(string, unicode, void).
An example castfunc is:
.. code-block:: c
static void
double_to_float(double *from, float *to, npy_intp n,
                void *ig1, void *ig2)
{
    while (n--) {
        *to++ = (float) *from++;
    }
}
This could then be registered to convert doubles to floats using the
code:
.. code-block:: c
doub = PyArray_DescrFromType(NPY_DOUBLE);
PyArray_RegisterCastFunc(doub, NPY_FLOAT,
(PyArray_VectorUnaryFunc *)double_to_float);
Py_DECREF(doub);
Registering coercion rules
--------------------------
By default, all user-defined data-types are not presumed to be safely
castable to any builtin data-types. In addition builtin data-types are
not presumed to be safely castable to user-defined data-types. This
situation limits the ability of user-defined data-types to participate
in the coercion system used by ufuncs and other times when automatic
coercion takes place in NumPy. This can be changed by registering
data-types as safely castable from a particular data-type object. The
function :cfunc:`PyArray_RegisterCanCast` (from_descr, totype_number,
scalarkind) should be used to specify that the data-type object
from_descr can be cast to the data-type with type number
totype_number. If you are not trying to alter scalar coercion rules,
then use :cdata:`PyArray_NOSCALAR` for the scalarkind argument.
If you want to allow your new data-type to also be able to share in
the scalar coercion rules, then you need to specify the scalarkind
function in the data-type object's ".f" member to return the kind of
scalar the new data-type should be seen as (the value of the scalar is
available to that function). Then, you can register data-types that
can be cast to separately for each scalar kind that may be returned
from your user-defined data-type. If you don't register scalar
coercion handling, then all of your user-defined data-types will be
seen as :cdata:`PyArray_NOSCALAR`.
Registering a ufunc loop
------------------------
You may also want to register low-level ufunc loops for your data-type
so that an ndarray of your data-type can have math applied to it
seamlessly. Registering a new loop with exactly the same arg_types
signature silently replaces any previously registered loops for that
data-type.
Before you can register a 1-d loop for a ufunc, the ufunc must be
previously created. Then you call :cfunc:`PyUFunc_RegisterLoopForType`
(...) with the information needed for the loop. The return value of
this function is ``0`` if the process was successful and ``-1`` with
an error condition set if it was not successful.
.. cfunction:: int PyUFunc_RegisterLoopForType( PyUFuncObject* ufunc,
int usertype, PyUFuncGenericFunction function, int* arg_types, void* data)
*ufunc*
The ufunc to attach this loop to.
*usertype*
The user-defined type this loop should be indexed under. This number
must be a user-defined type or an error occurs.
*function*
The ufunc inner 1-d loop. This function must have the signature as
explained in Section `3 <#sec-creating-a-new>`__ .
*arg_types*
(optional) If given, this should contain an array of integers of at
least size ufunc.nargs containing the data-types expected by the loop
function. The data will be copied into a NumPy-managed structure so
the memory for this argument should be deleted after calling this
function. If this is NULL, then it will be assumed that all data-types
are of type usertype.
*data*
(optional) Specify any optional data needed by the function which will
be passed when the function is called.
.. index::
pair: dtype; adding new
Subtyping the ndarray in C
==========================
One of the lesser-used features that has been lurking in Python since
2.2 is the ability to sub-class types in C. This facility is one of
the important reasons for basing NumPy off of the Numeric code-base
which was already in C. A sub-type in C allows much more flexibility
with regards to memory management. Sub-typing in C is not difficult
even if you have only a rudimentary understanding of how to create new
types for Python. While it is easiest to sub-type from a single parent
type, sub-typing from multiple parent types is also possible. Multiple
inheritance in C is generally less useful than it is in Python because
a restriction on Python sub-types is that they have a binary
compatible memory layout. Perhaps for this reason, it is somewhat
easier to sub-type from a single parent type.
.. index::
pair: ndarray; subtyping
All C-structures corresponding to Python objects must begin with
:cmacro:`PyObject_HEAD` (or :cmacro:`PyObject_VAR_HEAD`). In the same
way, any sub-type must have a C-structure that begins with exactly the
same memory layout as the parent type (or all of the parent types in
the case of multiple-inheritance). The reason for this is that Python
may attempt to access a member of the sub-type structure as if it had
the parent structure ( *i.e.* it will cast a given pointer to a
pointer to the parent structure and then dereference one of its
members). If the memory layouts are not compatible, then this attempt
will cause unpredictable behavior (eventually leading to a memory
violation and program crash).
One of the elements in :cmacro:`PyObject_HEAD` is a pointer to a
type-object structure. A new Python type is created by creating a new
type-object structure and populating it with functions and pointers to
describe the desired behavior of the type. Typically, a new
C-structure is also created to contain the instance-specific
information needed for each object of the type as well. For example,
:cdata:`&PyArray_Type` is a pointer to the type-object table for the ndarray
while a :ctype:`PyArrayObject *` variable is a pointer to a particular instance
of an ndarray (one of the members of the ndarray structure is, in
turn, a pointer to the type-object table :cdata:`&PyArray_Type`). Finally,
:cfunc:`PyType_Ready` (<pointer_to_type_object>) must be called for
every new Python type.
Creating sub-types
------------------
To create a sub-type, a similar procedure must be followed except
only behaviors that are different require new entries in the
type-object structure. All other entries can be NULL and will be filled in
by :cfunc:`PyType_Ready` with appropriate functions from the parent
type(s). In particular, to create a sub-type in C follow these steps:
1. If needed create a new C-structure to handle each instance of your
type. A typical C-structure would be:
.. code-block:: c
    typedef struct {
        PyArrayObject base;
        /* new things here */
    } NewArrayObject;
Notice that the full PyArrayObject is used as the first entry in order
to ensure that the binary layout of instances of the new type is
identical to the PyArrayObject.
2. Fill in a new Python type-object structure with pointers to new
functions that will over-ride the default behavior while leaving any
function that should remain the same unfilled (or NULL). The tp_name
element should be different.
3. Fill in the tp_base member of the new type-object structure with a
pointer to the (main) parent type object. For multiple-inheritance,
also fill in the tp_bases member with a tuple containing all of the
parent objects in the order they should be used to define inheritance.
Remember, all parent-types must have the same C-structure for multiple
inheritance to work properly.
4. Call :cfunc:`PyType_Ready` (<pointer_to_new_type>). If this function
returns a negative number, a failure occurred and the type is not
initialized. Otherwise, the type is ready to be used. It is
generally important to place a reference to the new type into the
module dictionary so it can be accessed from Python.
More information on creating sub-types in C can be learned by reading
PEP 253 (available at http://www.python.org/dev/peps/pep-0253).
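A minimal hedged sketch of steps 2 through 4 (the type and module
names are illustrative, and only Python 2-era slots are shown, to
match the rest of this document):

.. code-block:: c

    /* Step 2: fill in only what differs; everything left NULL/0 is
       inherited from the parent by PyType_Ready. */
    static PyTypeObject NewArray_Type = {
        PyObject_HEAD_INIT(NULL)
        0,                          /* ob_size */
        "mymodule.NewArray",        /* tp_name */
        sizeof(NewArrayObject),     /* tp_basicsize */
    };

    /* Steps 3 and 4, typically in the module init function: */
    NewArray_Type.tp_flags = Py_TPFLAGS_DEFAULT;
    NewArray_Type.tp_base = &PyArray_Type;
    if (PyType_Ready(&NewArray_Type) < 0) {
        return;                     /* type initialization failed */
    }
    Py_INCREF(&NewArray_Type);
    PyModule_AddObject(module, "NewArray", (PyObject *)&NewArray_Type);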
Specific features of ndarray sub-typing
---------------------------------------
Some special methods and attributes are used by arrays in order to
facilitate the interoperation of sub-types with the base ndarray type.
The __array_finalize\__ method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. attribute:: ndarray.__array_finalize__
Several array-creation functions of the ndarray allow
specification of a particular sub-type to be created. This allows
sub-types to be handled seamlessly in many routines. When a
sub-type is created in such a fashion, however, neither the
__new\__ method nor the __init\__ method gets called. Instead, the
sub-type is allocated and the appropriate instance-structure
members are filled in. Finally, the :obj:`__array_finalize__`
attribute is looked-up in the object dictionary. If it is present
and not None, then it can be either a CObject containing a pointer
to a :cfunc:`PyArray_FinalizeFunc` or it can be a method taking a
single argument (which could be None).
If the :obj:`__array_finalize__` attribute is a CObject, then the pointer
must be a pointer to a function with the signature:
.. code-block:: c
(int) (PyArrayObject *, PyObject *)
The first argument is the newly created sub-type. The second argument
(if not NULL) is the "parent" array (if the array was created using
slicing or some other operation where a clearly-distinguishable parent
is present). This routine can do anything it wants to. It should
return a -1 on error and 0 otherwise.
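A hedged sketch of such a function (the name and the copied state are
illustrative only):

.. code-block:: c

    /* `self` is the newly created sub-type instance; `parent` is the
       array it derives from, or NULL. */
    static int
    newarray_finalize(PyArrayObject *self, PyObject *parent)
    {
        if (parent != NULL) {
            /* copy any instance-specific state from the parent here */
        }
        return 0;   /* return -1 on error */
    }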
If the :obj:`__array_finalize__` attribute is not None nor a CObject,
then it must be a Python method that takes the parent array as an
argument (which could be None if there is no parent), and returns
nothing. Errors in this method will be caught and handled.
The __array_priority\__ attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. attribute:: ndarray.__array_priority__
This attribute allows simple but flexible determination of which sub-
type should be considered "primary" when an operation involving two or
more sub-types arises. In operations where different sub-types are
being used, the sub-type with the largest :obj:`__array_priority__`
attribute will determine the sub-type of the output(s). If two sub-
types have the same :obj:`__array_priority__` then the sub-type of the
first argument determines the output. The default
:obj:`__array_priority__` attribute returns a value of 0.0 for the base
ndarray type and 1.0 for a sub-type. This attribute can also be
defined by objects that are not sub-types of the ndarray and can be
used to determine which :obj:`__array_wrap__` method should be called for
the return output.
The __array_wrap\__ method
^^^^^^^^^^^^^^^^^^^^^^^^^^
.. attribute:: ndarray.__array_wrap__
Any class or type can define this method which should take an ndarray
argument and return an instance of the type. It can be seen as the
opposite of the :obj:`__array__` method. This method is used by the
ufuncs (and other NumPy functions) to allow other objects to pass
through. For Python >2.4, it can also be used to write a decorator
that converts a function that works only with ndarrays to one that
works with any type with :obj:`__array__` and :obj:`__array_wrap__` methods.
.. index::
pair: ndarray; subtyping
*******************
How to extend NumPy
*******************
| That which is static and repetitive is boring. That which is dynamic
| and random is confusing. In between lies art.
| --- *John A. Locke*
| Science is a differential equation. Religion is a boundary condition.
| --- *Alan Turing*
.. _`sec:Writing-an-extension`:
Writing an extension module
===========================
While the ndarray object is designed to allow rapid computation in
Python, it is also designed to be general-purpose and satisfy a wide-
variety of computational needs. As a result, if absolute speed is
essential, there is no replacement for a well-crafted, compiled loop
specific to your application and hardware. This is one of the reasons
that numpy includes f2py so that easy-to-use mechanisms for linking
(simple) C/C++ and (arbitrary) Fortran code directly into Python are
available. You are encouraged to use and improve this mechanism. The
purpose of this section is not to document this tool but to document
the more basic steps to writing an extension module that this tool
depends on.
.. index::
single: extension module
When an extension module is written, compiled, and installed to
somewhere in the Python path (sys.path), the code can then be imported
into Python as if it were a standard python file. It will contain
objects and methods that have been defined and compiled in C code. The
basic steps for doing this in Python are well-documented and you can
find more information in the documentation for Python itself available
online at `www.python.org <http://www.python.org>`_ .
In addition to the Python C-API, there is a full and rich C-API for
NumPy allowing sophisticated manipulations on a C-level. However, for
most applications, only a few API calls will typically be used. If all
you need to do is extract a pointer to memory along with some shape
information to pass to another calculation routine, then you will use
very different calls than if you are trying to create a new array-
like type or add a new data type for ndarrays. This chapter documents
the API calls and macros that are most commonly used.
Required subroutine
===================
There is exactly one function that must be defined in your C-code in
order for Python to use it as an extension module. The function must
be called init{name} where {name} is the name of the module from
Python. This function must be declared so that it is visible to code
outside of the routine. Besides adding the methods and constants you
desire, this subroutine must also contain calls to import_array()
and/or import_ufunc() depending on which C-API is needed. Forgetting
to place these commands will show itself as an ugly segmentation fault
(crash) as soon as any C-API subroutine is actually called. It is
actually possible to have multiple init{name} functions in a single
file in which case multiple modules will be defined by that file.
However, there are some tricks to get that to work correctly and it is
not covered here.
A minimal ``init{name}`` method looks like:
.. code-block:: c
    PyMODINIT_FUNC
    init{name}(void)
    {
        (void)Py_InitModule("{name}", mymethods);
        import_array();
    }
The mymethods must be an array (usually statically declared) of
PyMethodDef structures which contain method names, actual C-functions,
a variable indicating whether the method uses keyword arguments or
not, and docstrings. These are explained in the next section. If you
want to add constants to the module, then you store the returned value
from Py_InitModule which is a module object. The most general way to
add items to the module is to get the module dictionary using
PyModule_GetDict(module). With the module dictionary, you can add
whatever you like to the module manually. An easier way to add objects
to the module is to use one of three additional Python C-API calls
that do not require a separate extraction of the module dictionary.
These are documented in the Python documentation, but repeated here
for convenience:
.. cfunction:: int PyModule_AddObject(PyObject* module, char* name, PyObject* value)
.. cfunction:: int PyModule_AddIntConstant(PyObject* module, char* name, long value)
.. cfunction:: int PyModule_AddStringConstant(PyObject* module, char* name, char* value)
All three of these functions require the *module* object (the
return value of Py_InitModule). The *name* is a string that
labels the value in the module. Depending on which function is
called, the *value* argument is either a general object
(:cfunc:`PyModule_AddObject` steals a reference to it), an integer
constant, or a string constant.
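For instance, a hedged sketch (the module and constant names here are
purely illustrative):

.. code-block:: c

    PyObject *module = Py_InitModule("example", mymethods);
    if (module == NULL) {
        return;
    }
    /* each call returns 0 on success and -1 on failure */
    PyModule_AddIntConstant(module, "ANSWER", 42);
    PyModule_AddStringConstant(module, "VERSION", "1.0");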
Defining functions
==================
The second argument passed in to the Py_InitModule function is a
structure that makes it easy to define functions in the module. In
the example given above, the mymethods structure would have been
defined earlier in the file (usually right before the init{name}
subroutine) to:
.. code-block:: c
    static PyMethodDef mymethods[] = {
        {"nokeywordfunc", nokeyword_cfunc,
         METH_VARARGS,
         "Doc string"},
        {"keywordfunc", (PyCFunction)keyword_cfunc,
         METH_VARARGS|METH_KEYWORDS,
         "Doc string"},
        {NULL, NULL, 0, NULL}    /* Sentinel */
    };
Each entry in the mymethods array is a :ctype:`PyMethodDef` structure
containing 1) the Python name, 2) the C-function that implements the
function, 3) flags indicating whether or not keywords are accepted for
this function, and 4) the docstring for the function. Any number of
functions may be defined for a single module by adding more entries to
this table. The last entry must be all NULL as shown to act as a
sentinel. Python looks for this entry to know that all of the
functions for the module have been defined.
The last thing that must be done to finish the extension module is to
actually write the code that performs the desired functions. There are
two kinds of functions: those that don't accept keyword arguments, and
those that do.
Functions without keyword arguments
-----------------------------------
Functions that don't accept keyword arguments should be written as:
.. code-block:: c
static PyObject*
nokeyword_cfunc (PyObject *dummy, PyObject *args)
{
/* convert Python arguments */
/* do function */
/* return something */
}
The dummy argument is not used in this context and can be safely
ignored. The *args* argument contains all of the arguments passed in
to the function as a tuple. You can do anything you want at this
point, but usually the easiest way to manage the input arguments is to
call :cfunc:`PyArg_ParseTuple` (args, format_string,
addresses_to_C_variables...) or :cfunc:`PyArg_UnpackTuple` (tuple, "name",
min, max, ...). A good description of how to use the first function is
contained in the Python C-API reference manual under section 5.5
(Parsing arguments and building values). You should pay particular
attention to the "O&" format which uses converter functions to go
between the Python object and the C object. All of the other format
functions can be (mostly) thought of as special cases of this general
rule. There are several converter functions defined in the NumPy C-API
that may be of use. In particular, the :cfunc:`PyArray_DescrConverter`
function is very useful to support arbitrary data-type specification.
This function transforms any valid data-type Python object into a
:ctype:`PyArray_Descr *` object. Remember to pass in the address of the
C-variables that should be filled in.
There are lots of examples of how to use :cfunc:`PyArg_ParseTuple`
throughout the NumPy source code. The standard usage is like this:
.. code-block:: c
PyObject *input;
PyArray_Descr *dtype;
if (!PyArg_ParseTuple(args, "OO&", &input,
PyArray_DescrConverter,
&dtype)) return NULL;
It is important to keep in mind that you get a *borrowed* reference to
the object when using the "O" format string. However, the converter
functions usually require some form of memory handling. In this
example, if the conversion is successful, *dtype* will hold a new
reference to a :ctype:`PyArray_Descr *` object, while *input* will hold a
borrowed reference. Therefore, if this conversion were mixed with
another conversion (say to an integer) and the data-type conversion
was successful but the integer conversion failed, then you would need
to release the reference count to the data-type object before
returning. A typical way to do this is to set *dtype* to ``NULL``
before calling :cfunc:`PyArg_ParseTuple` and then use :cfunc:`Py_XDECREF`
on *dtype* before returning.
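A hedged sketch of that defensive pattern (the extra integer argument
is hypothetical):

.. code-block:: c

    PyObject *input;
    PyArray_Descr *dtype = NULL;  /* so Py_XDECREF below is always safe */
    int axis;

    if (!PyArg_ParseTuple(args, "OO&i", &input,
                          PyArray_DescrConverter, &dtype,
                          &axis)) {
        /* the data-type conversion may have succeeded even though a
           later conversion failed; release the reference it created */
        Py_XDECREF(dtype);
        return NULL;
    }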
After the input arguments are processed, the code that actually does
the work is written (likely calling other functions as needed). The
final step of the C-function is to return something. If an error is
encountered then ``NULL`` should be returned (making sure an error has
actually been set). If nothing should be returned then increment
:cdata:`Py_None` and return it. If a single object should be returned then
it is returned (ensuring that you own a reference to it first). If
multiple objects should be returned then you need to return a tuple.
The :cfunc:`Py_BuildValue` (format_string, c_variables...) function makes
it easy to build tuples of Python objects from C variables. Pay
special attention to the difference between 'N' and 'O' in the format
string or you can easily create memory leaks. The 'O' format string
increments the reference count of the :ctype:`PyObject *` C-variable it
corresponds to, while the 'N' format string steals a reference to the
corresponding :ctype:`PyObject *` C-variable. You should use 'N' if you have
already created a reference for the object and just want to give that
reference to the tuple. You should use 'O' if you only have a borrowed
reference to an object and need to create one to provide for the
tuple.
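A short hedged sketch of the difference (``new_arr`` and
``borrowed_obj`` are placeholder variables):

.. code-block:: c

    /* new_arr was created in this function, so 'N' hands its single
       reference to the tuple; borrowed_obj is only borrowed, so 'O'
       creates the extra reference the tuple needs. */
    return Py_BuildValue("NO", new_arr, borrowed_obj);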
Functions with keyword arguments
--------------------------------
These functions are very similar to functions without keyword
arguments. The only difference is that the function signature is:
.. code-block:: c
static PyObject*
keyword_cfunc (PyObject *dummy, PyObject *args, PyObject *kwds)
{
...
}
The kwds argument holds a Python dictionary whose keys are the names
of the keyword arguments and whose values are the corresponding
keyword-argument values. This dictionary can be processed however you
see fit. The easiest way to handle it, however, is to replace the
:cfunc:`PyArg_ParseTuple` (args, format_string, addresses...) function with
a call to :cfunc:`PyArg_ParseTupleAndKeywords` (args, kwds, format_string,
char \*kwlist[], addresses...). The kwlist parameter to this function
is a ``NULL`` -terminated array of strings providing the expected
keyword arguments. There should be one string for each entry in the
format_string. Using this function will raise a TypeError if invalid
keyword arguments are passed in.
For more help on this function please see section 1.8 (Keyword
Parameters for Extension Functions) of the Extending and Embedding
tutorial in the Python documentation.
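A hedged sketch, assuming a function that takes one required object
and an optional ``axis`` keyword:

.. code-block:: c

    static PyObject *
    keyword_cfunc(PyObject *dummy, PyObject *args, PyObject *kwds)
    {
        static char *kwlist[] = {"arr", "axis", NULL};
        PyObject *arr;
        int axis = -1;

        if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|i", kwlist,
                                         &arr, &axis)) {
            return NULL;
        }
        /* ... use arr and axis ... */
        Py_INCREF(Py_None);
        return Py_None;
    }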
Reference counting
------------------
The biggest difficulty when writing extension modules is reference
counting. It is an important reason for the popularity of f2py, weave,
pyrex, ctypes, etc.... If you mis-handle reference counts you can get
problems from memory-leaks to segmentation faults. The only strategy I
know of to handle reference counts correctly is blood, sweat, and
tears. First, you force it into your head that every Python variable
has a reference count. Then, you understand exactly what each function
does to the reference count of your objects, so that you can properly
use DECREF and INCREF when you need them. Reference counting can
really test the amount of patience and diligence you have towards your
programming craft. Despite the grim depiction, most cases of reference
counting are quite straightforward with the most common difficulty
being not using DECREF on objects before exiting early from a routine
due to some error. In second place is the common error of not owning
the reference on an object that is passed to a function or macro that
is going to steal the reference ( *e.g.* :cfunc:`PyTuple_SET_ITEM`, and
most functions that take :ctype:`PyArray_Descr` objects).
.. index::
single: reference counting
Typically you get a new reference to a variable when it is created or
is the return value of some function (there are some prominent
exceptions, however --- such as getting an item out of a tuple or a
dictionary). When you own the reference, you are responsible to make
sure that :cfunc:`Py_DECREF` (var) is called when the variable is no
longer necessary (and no other function has "stolen" its
reference). Also, if you are passing a Python object to a function
that will "steal" the reference, then you need to make sure you own it
(or use :cfunc:`Py_INCREF` to get your own reference). You will also
encounter the notion of borrowing a reference. A function that borrows
a reference does not alter the reference count of the object and does
not expect to "hold on" to the reference. It's just going to use the
object temporarily. When you use :cfunc:`PyArg_ParseTuple` or
:cfunc:`PyArg_UnpackTuple` you receive a borrowed reference to the
objects in the tuple and should not alter their reference count inside
your function. With practice, you can learn to get reference counting
right, but it can be frustrating at first.
One common source of reference-count errors is the :cfunc:`Py_BuildValue`
function. Pay careful attention to the difference between the 'N'
format character and the 'O' format character. If you create a new
object in your subroutine (such as an output array), and you are
passing it back in a tuple of return values, then you should most-
likely use the 'N' format character in :cfunc:`Py_BuildValue`. The 'O'
character will increase the reference count by one. This will leave
the caller with two reference counts for a brand-new array. When the
variable is deleted and the reference count decremented by one, there
will still be that extra reference count, and the array will never be
deallocated. You will have a reference-counting induced memory leak.
Using the 'N' character will avoid this situation as it will return to
the caller an object (inside the tuple) with a single reference count.
.. index::
single: reference counting
Dealing with array objects
==========================
Most extension modules for NumPy will need to access the memory for an
ndarray object (or one of its sub-classes). The easiest way to do
this doesn't require you to know much about the internals of NumPy.
The method is to:

1. Ensure you are dealing with a well-behaved array (aligned, in machine
   byte-order and single-segment) of the correct type and number of
   dimensions:

   1. by converting it from some Python object using
      :cfunc:`PyArray_FromAny` or a macro built on it;

   2. by constructing a new ndarray of your desired shape and type
      using :cfunc:`PyArray_NewFromDescr` or a simpler macro or function
      based on it.

2. Get the shape of the array and a pointer to its actual data.

3. Pass the data and shape information on to a subroutine or other
   section of code that actually performs the computation.

4. If you are writing the algorithm, then I recommend that you use the
   stride information contained in the array to access the elements of
   the array (the :cfunc:`PyArray_GETPTR` macros make this painless).
   Then, you can relax your requirements so as not to force a
   single-segment array and the data-copying that might result.
Each of these sub-topics is covered in the following sub-sections.
Converting an arbitrary sequence object
---------------------------------------
The main routine for obtaining an array from any Python object that
can be converted to an array is :cfunc:`PyArray_FromAny`. This
function is very flexible with many input arguments. Several macros
make it easier to use the basic function. :cfunc:`PyArray_FROM_OTF` is
arguably the most useful of these macros for the most common uses. It
allows you to convert an arbitrary Python object to an array of a
specific builtin data-type ( *e.g.* float), while specifying a
particular set of requirements ( *e.g.* contiguous, aligned, and
writeable). The syntax is
.. cfunction:: PyObject *PyArray_FROM_OTF(PyObject* obj, int typenum, int requirements)
Return an ndarray from any Python object, *obj*, that can be
converted to an array. The number of dimensions in the returned
array is determined by the object. The desired data-type of the
returned array is provided in *typenum* which should be one of the
enumerated types. The *requirements* for the returned array can be
any combination of standard array flags. Each of these arguments
is explained in more detail below. You receive a new reference to
the array on success. On failure, ``NULL`` is returned and an
exception is set.
*obj*
The object can be any Python object convertible to an ndarray.
If the object is already (a subclass of) the ndarray that
satisfies the requirements then a new reference is returned.
Otherwise, a new array is constructed. The contents of *obj*
are copied to the new array unless the array interface is used
so that data does not have to be copied. Objects that can be
converted to an array include: 1) any nested sequence object,
2) any object exposing the array interface, 3) any object with
an :obj:`__array__` method (which should return an ndarray),
and 4) any scalar object (becomes a zero-dimensional
array). Sub-classes of the ndarray that otherwise fit the
requirements will be passed through. If you want to ensure
a base-class ndarray, then use :cdata:`NPY_ENSUREARRAY` in the
requirements flag. A copy is made only if necessary. If you
want to guarantee a copy, then pass in :cdata:`NPY_ENSURECOPY`
to the requirements flag.
*typenum*
One of the enumerated types or :cdata:`NPY_NOTYPE` if the data-type
should be determined from the object itself. The C-based names
can be used:
:cdata:`NPY_BOOL`, :cdata:`NPY_BYTE`, :cdata:`NPY_UBYTE`,
:cdata:`NPY_SHORT`, :cdata:`NPY_USHORT`, :cdata:`NPY_INT`,
:cdata:`NPY_UINT`, :cdata:`NPY_LONG`, :cdata:`NPY_ULONG`,
:cdata:`NPY_LONGLONG`, :cdata:`NPY_ULONGLONG`, :cdata:`NPY_DOUBLE`,
:cdata:`NPY_LONGDOUBLE`, :cdata:`NPY_CFLOAT`, :cdata:`NPY_CDOUBLE`,
:cdata:`NPY_CLONGDOUBLE`, :cdata:`NPY_OBJECT`.
Alternatively, the bit-width names can be used as supported on the
platform. For example:
:cdata:`NPY_INT8`, :cdata:`NPY_INT16`, :cdata:`NPY_INT32`,
:cdata:`NPY_INT64`, :cdata:`NPY_UINT8`,
:cdata:`NPY_UINT16`, :cdata:`NPY_UINT32`,
:cdata:`NPY_UINT64`, :cdata:`NPY_FLOAT32`,
:cdata:`NPY_FLOAT64`, :cdata:`NPY_COMPLEX64`,
:cdata:`NPY_COMPLEX128`.
The object will be converted to the desired type only if it
can be done without losing precision. Otherwise ``NULL`` will
be returned and an error raised. Use :cdata:`NPY_FORCECAST` in the
requirements flag to override this behavior.
*requirements*
The memory model for an ndarray admits arbitrary strides in
each dimension to advance to the next element of the array.
Often, however, you need to interface with code that expects a
C-contiguous or a Fortran-contiguous memory layout. In
addition, an ndarray can be misaligned (the address of an
element is not at an integral multiple of the size of the
element) which can cause your program to crash (or at least
work more slowly) if you try to dereference a pointer into
the array data. Both of these problems can be solved by
converting the Python object into an array that is more
"well-behaved" for your specific usage.
The requirements flag allows specification of what kind of array is
acceptable. If the object passed in does not satisfy these
requirements then a copy is made so that the returned object will
satisfy them (the ndarray memory model admits a very generic pointer
to memory). This flag allows specification of the desired properties
of the returned array object. All of the flags are explained in the
detailed API chapter. The flags most commonly needed are
:cdata:`NPY_IN_ARRAY`, :cdata:`NPY_OUT_ARRAY`, and
:cdata:`NPY_INOUT_ARRAY`:
.. cvar:: NPY_IN_ARRAY
Equivalent to :cdata:`NPY_CONTIGUOUS` \|
:cdata:`NPY_ALIGNED`. This combination of flags is useful
for arrays that must be in C-contiguous order and aligned.
These kinds of arrays are usually input arrays for some
algorithm.
.. cvar:: NPY_OUT_ARRAY
Equivalent to :cdata:`NPY_CONTIGUOUS` \|
:cdata:`NPY_ALIGNED` \| :cdata:`NPY_WRITEABLE`. This
combination of flags is useful to specify an array that is
in C-contiguous order, is aligned, and can be written to
as well. Such an array is usually returned as output
(although normally such output arrays are created from
scratch).
.. cvar:: NPY_INOUT_ARRAY
Equivalent to :cdata:`NPY_CONTIGUOUS` \|
:cdata:`NPY_ALIGNED` \| :cdata:`NPY_WRITEABLE` \|
:cdata:`NPY_UPDATEIFCOPY`. This combination of flags is
useful to specify an array that will be used for both
input and output. If a copy is needed, then when the
temporary is deleted (by your use of :cfunc:`Py_DECREF` at
the end of the interface routine), the temporary array
will be copied back into the original array passed in. Use
of the :cdata:`UPDATEIFCOPY` flag requires that the input
object is already an array (because other objects cannot
be automatically updated in this fashion). If an error
occurs use :cfunc:`PyArray_DECREF_ERR` (obj) on an array
with the :cdata:`NPY_UPDATEIFCOPY` flag set. This will
delete the array without causing the contents to be copied
back into the original array.
Other useful flags that can be OR'd as additional requirements are:
.. cvar:: NPY_FORCECAST
Cast to the desired type, even if it can't be done without losing
information.
.. cvar:: NPY_ENSURECOPY
Make sure the resulting array is a copy of the original.
.. cvar:: NPY_ENSUREARRAY
Make sure the resulting object is an actual ndarray and not a sub-
class.
.. note::
Whether or not an array is byte-swapped is determined by the
data-type of the array. Native byte-order arrays are always
requested by :cfunc:`PyArray_FROM_OTF` and so there is no need for
a :cdata:`NPY_NOTSWAPPED` flag in the requirements argument. There
is also no way to get a byte-swapped array from this routine.
Creating a brand-new ndarray
----------------------------
Quite often new arrays must be created from within extension-module
code. Perhaps an output array is needed and you don't want the caller
to have to supply it. Perhaps only a temporary array is needed to hold
an intermediate calculation. Whatever the need there are simple ways
to get an ndarray object of whatever data-type is needed. The most
general function for doing this is :cfunc:`PyArray_NewFromDescr`. All array
creation functions go through this heavily re-used code. Because of
its flexibility, it can be somewhat confusing to use. As a result,
simpler forms exist that are easier to use.
.. cfunction:: PyObject *PyArray_SimpleNew(int nd, npy_intp* dims, int typenum)
This function allocates new memory and places it in an ndarray
with *nd* dimensions whose shape is determined by the array of
at least *nd* items pointed to by *dims*. The memory for the
array is uninitialized (unless typenum is :cdata:`PyArray_OBJECT` in
which case each element in the array is set to NULL). The
*typenum* argument allows specification of any of the builtin
data-types such as :cdata:`PyArray_FLOAT` or :cdata:`PyArray_LONG`. The
memory for the array can be set to zero if desired using
:cfunc:`PyArray_FILLWBYTE` (return_object, 0).
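A hedged usage sketch (the shape and type are arbitrary choices):

.. code-block:: c

    /* A new, zeroed 3x4 array of doubles. */
    npy_intp dims[2] = {3, 4};
    PyObject *arr = PyArray_SimpleNew(2, dims, NPY_DOUBLE);

    if (arr == NULL) {
        return NULL;
    }
    PyArray_FILLWBYTE(arr, 0);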
.. cfunction:: PyObject *PyArray_SimpleNewFromData( int nd, npy_intp* dims, int typenum, void* data)
Sometimes, you want to wrap memory allocated elsewhere into an
ndarray object for downstream use. This routine makes it
straightforward to do that. The first three arguments are the same
as in :cfunc:`PyArray_SimpleNew`, the final argument is a pointer to a
block of contiguous memory that the ndarray should use as its
data-buffer which will be interpreted in C-style contiguous
fashion. A new reference to an ndarray is returned, but the
ndarray will not own its data. When this ndarray is deallocated,
the pointer will not be freed.
You should ensure that the provided memory is not freed while the
returned array is in existence. The easiest way to handle this is
if data comes from another reference-counted Python object. The
reference count on this object should be increased after the
pointer is passed in, and the base member of the returned ndarray
should point to the Python object that owns the data. Then, when
the ndarray is deallocated, the base-member will be DECREF'd
appropriately. If you want the memory to be freed as soon as the
ndarray is deallocated then simply set the OWNDATA flag on the
returned ndarray.
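A hedged sketch of that pattern (``data``, ``n``, and ``owner`` are
assumptions; ``owner`` is a Python object that keeps ``data`` alive,
and in the API of this era the base member can be assigned directly):

.. code-block:: c

    PyObject *arr = PyArray_SimpleNewFromData(1, &n, NPY_DOUBLE, data);

    if (arr == NULL) {
        return NULL;
    }
    /* tie the array's lifetime to the object owning the memory */
    Py_INCREF(owner);
    PyArray_BASE(arr) = owner;   /* DECREF'd when arr is deallocated */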
Getting at ndarray memory and accessing elements of the ndarray
---------------------------------------------------------------
If obj is an ndarray (:ctype:`PyArrayObject *`), then the data-area of the
ndarray is pointed to by the void* pointer :cfunc:`PyArray_DATA` (obj) or
the char* pointer :cfunc:`PyArray_BYTES` (obj). Remember that (in general)
this data-area may not be aligned according to the data-type, it may
represent byte-swapped data, and/or it may not be writeable. If the
data area is aligned and in native byte-order, then how to get at a
specific element of the array is determined only by the array of
npy_intp variables, :cfunc:`PyArray_STRIDES` (obj). In particular, this
c-array of integers shows how many **bytes** must be added to the
current element pointer to get to the next element in each dimension.
For arrays less than 4-dimensions there are :cfunc:`PyArray_GETPTR{k}`
(obj, ...) macros where {k} is the integer 1, 2, 3, or 4 that make
using the array strides easier. The arguments ``...`` represent {k} non-
negative integer indices into the array. For example, suppose ``E`` is
a 3-dimensional ndarray. A (void*) pointer to the element ``E[i,j,k]``
is obtained as :cfunc:`PyArray_GETPTR3` (E, i, j, k).
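Continuing that example, a hedged sketch that sums every element of
``E`` (assumed aligned, in native byte-order, and of type double):

.. code-block:: c

    npy_intp i, j, k;
    npy_intp *dims = PyArray_DIMS(E);
    double total = 0.0;

    /* stride-aware access: works for non-contiguous arrays too */
    for (i = 0; i < dims[0]; i++) {
        for (j = 0; j < dims[1]; j++) {
            for (k = 0; k < dims[2]; k++) {
                total += *(double *)PyArray_GETPTR3(E, i, j, k);
            }
        }
    }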
As explained previously, C-style contiguous arrays and Fortran-style
contiguous arrays have particular striding patterns. Two array flags
(:cdata:`NPY_C_CONTIGUOUS` and :cdata:`NPY_F_CONTIGUOUS`) indicate
whether or not the striding pattern of a particular array matches the
C-style contiguous or Fortran-style contiguous pattern, or neither.
Whether or not the striding pattern matches a standard C or Fortran
one can be tested using :cfunc:`PyArray_ISCONTIGUOUS` (obj) and
:cfunc:`PyArray_ISFORTRAN` (obj) respectively. Most third-party
libraries expect contiguous arrays. But, often it is not difficult to
support general-purpose striding. I encourage you to use the striding
information in your own code whenever possible, and reserve
single-segment requirements for wrapping third-party code. Using the
striding information provided with the ndarray rather than requiring a
contiguous striding reduces copying that otherwise must be made.
Example
=======
.. index::
single: extension module
The following example shows how you might write a wrapper that accepts
two input arguments (that will be converted to an array) and an output
argument (that must be an array). The function returns None and
updates the output array.
.. code-block:: c
static PyObject *
example_wrapper(PyObject *dummy, PyObject *args)
{
PyObject *arg1=NULL, *arg2=NULL, *out=NULL;
PyObject *arr1=NULL, *arr2=NULL, *oarr=NULL;
if (!PyArg_ParseTuple(args, "OOO!", &arg1, &arg2,
&PyArray_Type, &out)) return NULL;
arr1 = PyArray_FROM_OTF(arg1, NPY_DOUBLE, NPY_IN_ARRAY);
if (arr1 == NULL) return NULL;
arr2 = PyArray_FROM_OTF(arg2, NPY_DOUBLE, NPY_IN_ARRAY);
if (arr2 == NULL) goto fail;
oarr = PyArray_FROM_OTF(out, NPY_DOUBLE, NPY_INOUT_ARRAY);
if (oarr == NULL) goto fail;
/* code that makes use of arguments */
/* You will probably need at least
nd = PyArray_NDIM(<..>) -- number of dimensions
dims = PyArray_DIMS(<..>) -- npy_intp array of length nd
showing length in each dim.
dptr = (double *)PyArray_DATA(<..>) -- pointer to data.
If an error occurs goto fail.
*/
Py_DECREF(arr1);
Py_DECREF(arr2);
Py_DECREF(oarr);
Py_INCREF(Py_None);
return Py_None;
fail:
Py_XDECREF(arr1);
Py_XDECREF(arr2);
PyArray_XDECREF_ERR(oarr);
return NULL;
}