Package cherrypy :: Module _cpreqbody
[hide private]
[frames] | no frames]

Module _cpreqbody

source code

Request body processing for CherryPy.

.. versionadded:: 3.2

Application authors have complete control over the parsing of HTTP request
entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`
is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,
and *that* class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.

When an HTTP request includes an entity body, it is often desirable to
provide that information to applications in a form other than the raw bytes.
Different content types demand different approaches. Examples:

 * For a GIF file, we want the raw bytes in a stream.
 * An HTML form is better parsed into its component fields, and each text field
   decoded from bytes to unicode.
 * A JSON body should be deserialized into a Python dict or list.

When the request contains a Content-Type header, the media type is used as a
key to look up a value in the
:attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.
If the full media
type is not found, then the major type is tried; for example, if no processor
is found for the 'image/jpeg' type, then we look for a processor for the 'image'
types altogether. If neither the full type nor the major type has a matching
processor, then a default processor is used
(:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most
types, this means no processing is done, and the body is left unread as a
raw byte stream. Processors are configurable in an 'on_start_resource' hook.

Some processors, especially those for the 'text' types, attempt to decode bytes
to unicode. If the Content-Type request header includes a 'charset' parameter,
this is used to decode the entity. Otherwise, one or more default charsets may
be attempted, although this decision is up to each processor. If a processor
successfully decodes an Entity or Part, it should set the
:attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute
on the Entity or Part to the name of the successful charset, so that
applications can easily re-encode or transcode the value if they wish.

If the Content-Type of the request entity is of major type 'multipart', then
the above parsing process, and possibly a decoding process, is performed for
each part.

For both the full entity and multipart parts, a Content-Disposition header may
be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and
:attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the
request.body or the Part.

.. _custombodyprocessors:

Custom Processors
=================

You can add your own processors for any specific or major MIME type. Simply add
it to the :attr:`processors<cherrypy._cprequest.Entity.processors>` dict in a
hook/tool that runs at ``on_start_resource`` or ``before_request_body``. 
Here's the built-in JSON tool for an example::

    def json_in(force=True, debug=False):
        request = cherrypy.serving.request
        def json_processor(entity):
            """Read application/json data into request.json."""
            if not entity.headers.get("Content-Length", ""):
                raise cherrypy.HTTPError(411)
            
            body = entity.fp.read()
            try:
                request.json = json_decode(body)
            except ValueError:
                raise cherrypy.HTTPError(400, 'Invalid JSON document')
        if force:
            request.body.processors.clear()
            request.body.default_proc = cherrypy.HTTPError(
                415, 'Expected an application/json content type')
        request.body.processors['application/json'] = json_processor

We begin by defining a new ``json_processor`` function to stick in the ``processors``
dictionary. All processor functions take a single argument, the ``Entity`` instance
they are to process. It will be called whenever a request is received (for those
URI's where the tool is turned on) which has a ``Content-Type`` of
"application/json".

First, it checks for a valid ``Content-Length`` (raising 411 if not valid), then
reads the remaining bytes on the socket. The ``fp`` object knows its own length, so
it won't hang waiting for data that never arrives. It will return when all data
has been read. Then, we decode those bytes using Python's built-in ``json`` module,
and stick the decoded result onto ``request.json`` . If it cannot be decoded, we
raise 400.

If the "force" argument is True (the default), the ``Tool`` clears the ``processors``
dict so that request entities of other ``Content-Types`` aren't parsed at all. Since
there's no entry for those invalid MIME types, the ``default_proc`` method of ``cherrypy.request.body``
is called. But this does nothing by default (usually to provide the page handler an opportunity to handle it.)
But in our case, we want to raise 415, so we replace ``request.body.default_proc``
with the error (``HTTPError`` instances, when called, raise themselves).

If we were defining a custom processor, we can do so without making a ``Tool``. Just add the config entry::

    request.body.processors = {'application/json': json_processor}

Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.

Classes [hide private]
  Entity
An HTTP request body, or MIME multipart body.
  Part
A MIME part entity, part of a multipart entity.
  Infinity
  SizedReader
  RequestBody
The entity of the HTTP request.
Functions [hide private]
 
process_urlencoded(entity)
Read application/x-www-form-urlencoded data into entity.params.
source code
 
process_multipart(entity)
Read all multipart parts into entity.parts.
source code
 
process_multipart_form_data(entity)
Read all multipart/form-data parts into entity.parts or entity.params.
source code
 
_old_process_multipart(entity)
The behavior of 3.2 and lower.
source code
Variables [hide private]
  DEFAULT_BUFFER_SIZE = 8192
  inf = inf
  comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept...
  __package__ = 'cherrypy'
Function Details [hide private]

_old_process_multipart(entity)

source code 

The behavior of 3.2 and lower. Deprecated and will be changed in 3.3.


Variables Details [hide private]

comma_separated_headers

Value:
['Accept',
 'Accept-Charset',
 'Accept-Encoding',
 'Accept-Language',
 'Accept-Ranges',
 'Allow',
 'Cache-Control',
 'Connection',
...