
Source Code for Module cherrypy._cpreqbody

"""Request body processing for CherryPy.

.. versionadded:: 3.2

Application authors have complete control over the parsing of HTTP request
entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`
is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,
and *that* class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.

When an HTTP request includes an entity body, it is often desirable to
provide that information to applications in a form other than the raw bytes.
Different content types demand different approaches. Examples:

 * For a GIF file, we want the raw bytes in a stream.
 * An HTML form is better parsed into its component fields, and each text field
   decoded from bytes to unicode.
 * A JSON body should be deserialized into a Python dict or list.

When the request contains a Content-Type header, the media type is used as a
key to look up a value in the
:attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.
If the full media type is not found, then the major type is tried; for example,
if no processor is found for the 'image/jpeg' type, then we look for a processor
for the 'image' major type. If neither the full type nor the major type has a
matching processor, then a default processor is used
(:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most
types, this means no processing is done, and the body is left unread as a
raw byte stream. Processors are configurable in an 'on_start_resource' hook.
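
The lookup order described above can be sketched in a few lines. This is a
simplified stand-in rather than the module's own code path: ``find_processor``
is a hypothetical helper, and the string values are placeholders for real
processor callables.

```python
def find_processor(processors, media_type, default=None):
    # Try the full media type first, e.g. 'image/jpeg'.
    try:
        return processors[media_type]
    except KeyError:
        # Fall back to the major type, e.g. 'image'.
        major_type = media_type.split('/', 1)[0]
        return processors.get(major_type, default)


# Placeholder values; real entries are callables taking an Entity.
processors = {
    'multipart/form-data': 'form-data processor',
    'multipart': 'generic multipart processor',
}

exact = find_processor(processors, 'multipart/form-data')
by_major_type = find_processor(processors, 'multipart/mixed')
unmatched = find_processor(processors, 'image/jpeg', default='default_proc')
```

Here ``multipart/mixed`` has no entry of its own, so the 'multipart' entry is
used, while ``image/jpeg`` matches nothing and falls through to the default.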

Some processors, especially those for the 'text' types, attempt to decode bytes
to unicode. If the Content-Type request header includes a 'charset' parameter,
this is used to decode the entity. Otherwise, one or more default charsets may
be attempted, although this decision is up to each processor. If a processor
successfully decodes an Entity or Part, it should set the
:attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute
on the Entity or Part to the name of the successful charset, so that
applications can easily re-encode or transcode the value if they wish.
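
As a rough illustration of the decoding strategy, the sketch below tries each
candidate charset in order and reports which one succeeded. The helper names
``build_attempt_charsets`` and ``decode_with_attempts`` are assumptions made
for this example; they are not part of the module.

```python
def build_attempt_charsets(declared, defaults):
    # Put the charset declared in Content-Type (if any) first,
    # followed by the defaults, without listing it twice.
    if declared:
        return [declared] + [c for c in defaults if c != declared]
    return list(defaults)


def decode_with_attempts(raw, attempt_charsets):
    # Return (text, charset) for the first charset that decodes cleanly.
    for charset in attempt_charsets:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError('Could not decode using %r' % (attempt_charsets,))


# The bytes for "caf" followed by a UTF-8 encoded e-acute.
raw = bytes([0x63, 0x61, 0x66, 0xC3, 0xA9])
attempts = build_attempt_charsets(None, ['us-ascii', 'utf-8'])
text, charset = decode_with_attempts(raw, attempts)
```

The 'us-ascii' attempt fails on the non-ASCII bytes, so the 'utf-8' attempt
wins and its name is what a processor would record as the entity's charset.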

If the Content-Type of the request entity is of major type 'multipart', then
the above parsing process, and possibly a decoding process, is performed for
each part.

For both the full entity and multipart parts, a Content-Disposition header may
be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and
:attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the
request.body or the Part.
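
To make the Content-Disposition handling concrete, here is a deliberately
naive sketch. ``parse_disposition`` is a hypothetical helper that splits on
semicolons, so it would mishandle quoted values containing a semicolon; the
module itself relies on its header-parsing utilities instead.

```python
def parse_disposition(value):
    # Return (name, filename) from a Content-Disposition header value.
    name = None
    filename = None
    # Skip the first piece, which is the disposition type (e.g. form-data).
    for piece in value.split(';')[1:]:
        key, sep, val = piece.strip().partition('=')
        if not sep:
            continue
        # Strip one pair of surrounding double quotes, if present.
        if val.startswith('"') and val.endswith('"'):
            val = val[1:-1]
        if key == 'name':
            name = val
        elif key == 'filename':
            filename = val
    return name, filename


field = parse_disposition('form-data; name="comment"')
upload = parse_disposition('form-data; name="upload"; filename="notes.txt"')
```

A part without a filename is treated as a regular form field, while a part
with one is treated as a file upload.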

.. _custombodyprocessors:

Custom Processors
=================

You can add your own processors for any specific or major MIME type. Simply add
each one to the :attr:`processors<cherrypy._cpreqbody.Entity.processors>` dict in a
hook/tool that runs at ``on_start_resource`` or ``before_request_body``.
Here's the built-in JSON tool as an example::

    def json_in(force=True, debug=False):
        request = cherrypy.serving.request
        def json_processor(entity):
            \"""Read application/json data into request.json.\"""
            if not entity.headers.get("Content-Length", ""):
                raise cherrypy.HTTPError(411)

            body = entity.fp.read()
            try:
                request.json = json_decode(body)
            except ValueError:
                raise cherrypy.HTTPError(400, 'Invalid JSON document')
        if force:
            request.body.processors.clear()
            request.body.default_proc = cherrypy.HTTPError(
                415, 'Expected an application/json content type')
        request.body.processors['application/json'] = json_processor

We begin by defining a new ``json_processor`` function to stick in the ``processors``
dictionary. All processor functions take a single argument, the ``Entity`` instance
they are to process. It will be called whenever a request is received (for those
URIs where the tool is turned on) which has a ``Content-Type`` of
"application/json".

First, it checks for a valid ``Content-Length`` (raising 411 if not valid), then
reads the remaining bytes on the socket. The ``fp`` object knows its own length, so
it won't hang waiting for data that never arrives. It will return when all data
has been read. Then, we decode those bytes using Python's built-in ``json`` module,
and stick the decoded result onto ``request.json``. If it cannot be decoded, we
raise 400.
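
The same pattern applies to other media types. The sketch below is a
hypothetical processor for CSV bodies, shown against a stand-in object so it
can run without CherryPy: ``csv_processor``, ``StandInEntity`` and the ``rows``
attribute are all assumptions made for this example.

```python
import csv
import io


def csv_processor(entity):
    # Read the whole body, then try each candidate charset in order.
    raw = entity.fp.read()
    for charset in entity.attempt_charsets:
        try:
            text = raw.decode(charset)
        except UnicodeDecodeError:
            continue
        entity.charset = charset
        break
    else:
        # A real processor would raise cherrypy.HTTPError(400) here.
        raise ValueError('Request entity could not be decoded')
    # 'rows' is a made-up attribute name for this sketch.
    entity.rows = list(csv.reader(io.StringIO(text)))


class StandInEntity(object):
    # Minimal substitute for an Entity, for demonstration only.
    def __init__(self, body):
        self.fp = io.BytesIO(body)
        self.attempt_charsets = ['utf-8']
        self.charset = None
        self.rows = None


CRLF = bytes([13, 10])
entity = StandInEntity(b'name,qty' + CRLF + b'tea,2' + CRLF)
csv_processor(entity)
```

In an application, such a function would presumably be registered under a key
like 'text/csv' in ``request.body.processors`` from a hook, in the same way the
JSON tool registers ``json_processor``.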

If the "force" argument is True (the default), the ``Tool`` clears the ``processors``
dict so that request entities of other ``Content-Types`` aren't parsed at all. Since
there's no entry for those invalid MIME types, the ``default_proc`` method of
``cherrypy.request.body`` is called. By default this does nothing (usually to give
the page handler an opportunity to handle the body itself). In our case, however,
we want to raise 415, so we replace ``request.body.default_proc``
with the error (``HTTPError`` instances, when called, raise themselves).
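
The trick of assigning an exception instance as ``default_proc`` relies on the
instance being callable. The sketch below shows the general pattern with a
stand-in class; ``CallableError`` and ``StandInBody`` are hypothetical names
and do not reproduce CherryPy's actual ``HTTPError`` implementation.

```python
class CallableError(Exception):
    # An exception whose instances raise themselves when called.
    def __init__(self, status, message):
        Exception.__init__(self, status, message)
        self.status = status
        self.message = message

    def __call__(self):
        raise self


class StandInBody(object):
    def default_proc(self):
        # Default behaviour: do nothing.
        pass


body = StandInBody()
# Shadow the method with an error instance on this one object only.
body.default_proc = CallableError(
    415, 'Expected an application/json content type')

caught_status = None
try:
    body.default_proc()
except CallableError as exc:
    caught_status = exc.status
```

Because the attribute is set on the instance, other instances keep the
do-nothing method defined on the class.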

If you are defining a custom processor, you can do so without making a ``Tool``.
Just add the config entry::

    request.body.processors = {'application/json': json_processor}

Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.
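
The difference between replacing and updating can be seen with a small
stand-in class. This is only a sketch: ``StandInEntity`` and its string values
are placeholders, although the per-instance copy mirrors what
``Entity.__init__`` does.

```python
class StandInEntity(object):
    # Class-level defaults, shared until an instance copies them.
    processors = {'application/x-www-form-urlencoded': 'urlencoded',
                  'multipart': 'multipart'}

    def __init__(self):
        # Per-instance copy, so changes do not leak to other requests.
        self.processors = self.processors.copy()


# Replacing wholesale (what the config entry does) drops the built-ins.
replaced = StandInEntity()
replaced.processors = {'application/json': 'json'}

# Updating in place (what a hook or tool can do) keeps them.
updated = StandInEntity()
updated.processors['application/json'] = 'json'
```

After replacement only the JSON entry remains, so other content types fall
through to ``default_proc``.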
"""

try:
    from io import DEFAULT_BUFFER_SIZE
except ImportError:
    DEFAULT_BUFFER_SIZE = 8192
import re
import sys
import tempfile
try:
    from urllib import unquote_plus
except ImportError:
    def unquote_plus(bs):
        """Bytes version of urllib.parse.unquote_plus."""
        bs = bs.replace(ntob('+'), ntob(' '))
        atoms = bs.split(ntob('%'))
        for i in range(1, len(atoms)):
            item = atoms[i]
            try:
                pct = int(item[:2], 16)
                atoms[i] = bytes([pct]) + item[2:]
            except ValueError:
                pass
        return ntob('').join(atoms)

import cherrypy
from cherrypy._cpcompat import basestring, ntob, ntou
from cherrypy.lib import httputil


# -------------------------------- Processors -------------------------------- #

def process_urlencoded(entity):
    """Read application/x-www-form-urlencoded data into entity.params."""
    qs = entity.fp.read()
    for charset in entity.attempt_charsets:
        try:
            params = {}
            for aparam in qs.split(ntob('&')):
                for pair in aparam.split(ntob(';')):
                    if not pair:
                        continue

                    atoms = pair.split(ntob('='), 1)
                    if len(atoms) == 1:
                        atoms.append(ntob(''))

                    key = unquote_plus(atoms[0]).decode(charset)
                    value = unquote_plus(atoms[1]).decode(charset)

                    if key in params:
                        if not isinstance(params[key], list):
                            params[key] = [params[key]]
                        params[key].append(value)
                    else:
                        params[key] = value
        except UnicodeDecodeError:
            pass
        else:
            entity.charset = charset
            break
    else:
        raise cherrypy.HTTPError(
            400, "The request entity could not be decoded. The following "
            "charsets were attempted: %s" % repr(entity.attempt_charsets))

    # Now that all values have been successfully parsed and decoded,
    # apply them to the entity.params dict.
    for key, value in params.items():
        if key in entity.params:
            if not isinstance(entity.params[key], list):
                entity.params[key] = [entity.params[key]]
            entity.params[key].append(value)
        else:
            entity.params[key] = value


def process_multipart(entity):
    """Read all multipart parts into entity.parts."""
    ib = ""
    if 'boundary' in entity.content_type.params:
        # http://tools.ietf.org/html/rfc2046#section-5.1.1
        # "The grammar for parameters on the Content-type field is such that it
        # is often necessary to enclose the boundary parameter values in quotes
        # on the Content-type line"
        ib = entity.content_type.params['boundary'].strip('"')

    if not re.match("^[ -~]{0,200}[!-~]$", ib):
        raise ValueError('Invalid boundary in multipart form: %r' % (ib,))

    ib = ('--' + ib).encode('ascii')

    # Find the first marker
    while True:
        b = entity.readline()
        if not b:
            return

        b = b.strip()
        if b == ib:
            break

    # Read all parts
    while True:
        part = entity.part_class.from_fp(entity.fp, ib)
        entity.parts.append(part)
        part.process()
        if part.fp.done:
            break

def process_multipart_form_data(entity):
    """Read all multipart/form-data parts into entity.parts or entity.params."""
    process_multipart(entity)

    kept_parts = []
    for part in entity.parts:
        if part.name is None:
            kept_parts.append(part)
        else:
            if part.filename is None:
                # It's a regular field
                value = part.fullvalue()
            else:
                # It's a file upload. Retain the whole part so consumer code
                # has access to its .file and .filename attributes.
                value = part

            if part.name in entity.params:
                if not isinstance(entity.params[part.name], list):
                    entity.params[part.name] = [entity.params[part.name]]
                entity.params[part.name].append(value)
            else:
                entity.params[part.name] = value

    entity.parts = kept_parts

def _old_process_multipart(entity):
    """The behavior of 3.2 and lower. Deprecated and will be changed in 3.3."""
    process_multipart(entity)

    params = entity.params

    for part in entity.parts:
        if part.name is None:
            key = ntou('parts')
        else:
            key = part.name

        if part.filename is None:
            # It's a regular field
            value = part.fullvalue()
        else:
            # It's a file upload. Retain the whole part so consumer code
            # has access to its .file and .filename attributes.
            value = part

        if key in params:
            if not isinstance(params[key], list):
                params[key] = [params[key]]
            params[key].append(value)
        else:
            params[key] = value



# --------------------------------- Entities --------------------------------- #


class Entity(object):
    """An HTTP request body, or MIME multipart body.

    This class collects information about the HTTP request entity. When a
    given entity is of MIME type "multipart", each part is parsed into its own
    Entity instance, and the set of parts stored in
    :attr:`entity.parts<cherrypy._cpreqbody.Entity.parts>`.

    Between the ``before_request_body`` and ``before_handler`` tools, CherryPy
    tries to process the request body (if any) by calling
    :func:`request.body.process<cherrypy._cpreqbody.RequestBody.process>`.
    This uses the ``content_type`` of the Entity to look up a suitable processor
    in :attr:`Entity.processors<cherrypy._cpreqbody.Entity.processors>`, a dict.
    If a matching processor cannot be found for the complete Content-Type,
    it tries again using the major type. For example, if a request with an
    entity of type "image/jpeg" arrives, but no processor can be found for
    that complete type, then one is sought for the major type "image". If a
    processor is still not found, then the
    :func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>` method of the
    Entity is called (which does nothing by default; you can override this too).

    CherryPy includes processors for the "application/x-www-form-urlencoded"
    type, the "multipart/form-data" type, and the "multipart" major type.
    CherryPy 3.2 processes these types almost exactly as older versions.
    Parts are passed as arguments to the page handler using their
    ``Content-Disposition.name`` if given, otherwise in a generic "parts"
    argument. Each such part is either a string, or the
    :class:`Part<cherrypy._cpreqbody.Part>` itself if it's a file. (In this
    case it will have ``file`` and ``filename`` attributes, or possibly a
    ``value`` attribute). Each Part is itself a subclass of
    Entity, and has its own ``process`` method and ``processors`` dict.

    There is a separate processor for the "multipart" major type which is more
    flexible, and simply stores all multipart parts in
    :attr:`request.body.parts<cherrypy._cpreqbody.Entity.parts>`. You can
    enable it with::

        cherrypy.request.body.processors['multipart'] = _cpreqbody.process_multipart

    in an ``on_start_resource`` tool.
    """

    # http://tools.ietf.org/html/rfc2046#section-4.1.2:
    # "The default character set, which must be assumed in the
    # absence of a charset parameter, is US-ASCII."
    # However, many browsers send data in utf-8 with no charset.
    attempt_charsets = ['utf-8']
    """A list of strings, each of which should be a known encoding.

    When the Content-Type of the request body warrants it, each of the given
    encodings will be tried in order. The first one to successfully decode the
    entity without raising an error is stored as
    :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
    to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
    `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),
    but ``['us-ascii', 'utf-8']`` for multipart parts.
    """

    charset = None
    """The successful decoding; see "attempt_charsets" above."""

    content_type = None
    """The value of the Content-Type request header.

    If the Entity is part of a multipart payload, this will be the Content-Type
    given in the MIME headers for this part.
    """

    default_content_type = 'application/x-www-form-urlencoded'
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however, the MIME spec
    declares that a part with no Content-Type defaults to "text/plain"
    (see :class:`Part<cherrypy._cpreqbody.Part>`).
    """

    filename = None
    """The ``Content-Disposition.filename`` header, if available."""

    fp = None
    """The readable socket file object."""

    headers = None
    """A dict of request/multipart header names and values.

    This is a copy of the ``request.headers`` for the ``request.body``;
    for multipart parts, it is the set of headers for that part.
    """

    length = None
    """The value of the ``Content-Length`` header, if provided."""

    name = None
    """The "name" parameter of the ``Content-Disposition`` header, if any."""

    params = None
    """
    If the request Content-Type is 'application/x-www-form-urlencoded' or
    multipart, this will be a dict of the params pulled from the entity
    body; that is, it will be the portion of request.params that come
    from the message body (sometimes called "POST params", although they
    can be sent with various HTTP method verbs). This value is set between
    the 'before_request_body' and 'before_handler' hooks (assuming that
    process_request_body is True)."""

    processors = {'application/x-www-form-urlencoded': process_urlencoded,
                  'multipart/form-data': process_multipart_form_data,
                  'multipart': process_multipart,
                  }
    """A dict of Content-Type names to processor methods."""

    parts = None
    """A list of Part instances if ``Content-Type`` is of major type "multipart"."""

    part_class = None
    """The class used for multipart parts.

    You can replace this with custom subclasses to alter the processing of
    multipart parts.
    """

    def __init__(self, fp, headers, params=None, parts=None):
        # Make an instance-specific copy of the class processors
        # so Tools, etc. can replace them per-request.
        self.processors = self.processors.copy()

        self.fp = fp
        self.headers = headers

        if params is None:
            params = {}
        self.params = params

        if parts is None:
            parts = []
        self.parts = parts

        # Content-Type
        self.content_type = headers.elements('Content-Type')
        if self.content_type:
            self.content_type = self.content_type[0]
        else:
            self.content_type = httputil.HeaderElement.from_str(
                self.default_content_type)

        # Copy the class 'attempt_charsets', prepending any Content-Type charset
        dec = self.content_type.params.get("charset", None)
        if dec:
            self.attempt_charsets = [dec] + [c for c in self.attempt_charsets
                                             if c != dec]
        else:
            self.attempt_charsets = self.attempt_charsets[:]

        # Length
        self.length = None
        clen = headers.get('Content-Length', None)
        # If Transfer-Encoding is 'chunked', ignore any Content-Length.
        if clen is not None and 'chunked' not in headers.get('Transfer-Encoding', ''):
            try:
                self.length = int(clen)
            except ValueError:
                pass

        # Content-Disposition
        self.name = None
        self.filename = None
        disp = headers.elements('Content-Disposition')
        if disp:
            disp = disp[0]
            if 'name' in disp.params:
                self.name = disp.params['name']
                if self.name.startswith('"') and self.name.endswith('"'):
                    self.name = self.name[1:-1]
            if 'filename' in disp.params:
                self.filename = disp.params['filename']
                if self.filename.startswith('"') and self.filename.endswith('"'):
                    self.filename = self.filename[1:-1]

    # The 'type' attribute is deprecated in 3.2; remove it in 3.3.
    type = property(lambda self: self.content_type,
        doc="""A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.""")

    def read(self, size=None, fp_out=None):
        return self.fp.read(size, fp_out)

    def readline(self, size=None):
        return self.fp.readline(size)

    def readlines(self, sizehint=None):
        return self.fp.readlines(sizehint)

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line

    def next(self):
        return self.__next__()

    def read_into_file(self, fp_out=None):
        """Read the request body into fp_out (or make_file() if None). Return fp_out."""
        if fp_out is None:
            fp_out = self.make_file()
        self.read(fp_out=fp_out)
        return fp_out

    def make_file(self):
        """Return a file-like object into which the request body will be read.

        By default, this will return a TemporaryFile. Override as needed.
        See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`."""
        return tempfile.TemporaryFile()

    def fullvalue(self):
        """Return this entity as a string, whether stored in a file or not."""
        if self.file:
            # It was stored in a tempfile. Read it.
            self.file.seek(0)
            value = self.file.read()
            self.file.seek(0)
        else:
            value = self.value
        return value

    def process(self):
        """Execute the best-match processor for the given media type."""
        proc = None
        ct = self.content_type.value
        try:
            proc = self.processors[ct]
        except KeyError:
            toptype = ct.split('/', 1)[0]
            try:
                proc = self.processors[toptype]
            except KeyError:
                pass
        if proc is None:
            self.default_proc()
        else:
            proc(self)

    def default_proc(self):
        """Called if a more-specific processor is not found for the ``Content-Type``."""
        # Leave the fp alone for someone else to read. This works fine
        # for request.body, but the Part subclasses need to override this
        # so they can move on to the next part.
        pass


class Part(Entity):
    """A MIME part entity, part of a multipart entity."""

    # "The default character set, which must be assumed in the absence of a
    # charset parameter, is US-ASCII."
    attempt_charsets = ['us-ascii', 'utf-8']
    """A list of strings, each of which should be a known encoding.

    When the Content-Type of the request body warrants it, each of the given
    encodings will be tried in order. The first one to successfully decode the
    entity without raising an error is stored as
    :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
    to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
    `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),
    but ``['us-ascii', 'utf-8']`` for multipart parts.
    """

    boundary = None
    """The MIME multipart boundary."""

    default_content_type = 'text/plain'
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however (this class),
    the MIME spec declares that a part with no Content-Type defaults to
    "text/plain".
    """

    # This is the default in stdlib cgi. We may want to increase it.
    maxrambytes = 1000
    """The threshold of bytes after which point the ``Part`` will store its data
    in a file (generated by :func:`make_file<cherrypy._cpreqbody.Entity.make_file>`)
    instead of a string. Defaults to 1000, just like the :mod:`cgi` module in
    Python's standard library.
    """

    def __init__(self, fp, headers, boundary):
        Entity.__init__(self, fp, headers)
        self.boundary = boundary
        self.file = None
        self.value = None

    def from_fp(cls, fp, boundary):
        headers = cls.read_headers(fp)
        return cls(fp, headers, boundary)
    from_fp = classmethod(from_fp)

    def read_headers(cls, fp):
        headers = httputil.HeaderMap()
        while True:
            line = fp.readline()
            if not line:
                # No more data--illegal end of headers
                raise EOFError("Illegal end of headers.")

            if line == ntob('\r\n'):
                # Normal end of headers
                break
            if not line.endswith(ntob('\r\n')):
                raise ValueError("MIME requires CRLF terminators: %r" % line)

            if line[0] in ntob(' \t'):
                # It's a continuation line.
                v = line.strip().decode('ISO-8859-1')
            else:
                k, v = line.split(ntob(":"), 1)
                k = k.strip().decode('ISO-8859-1')
                v = v.strip().decode('ISO-8859-1')

            existing = headers.get(k)
            if existing:
                v = ", ".join((existing, v))
            headers[k] = v

        return headers
    read_headers = classmethod(read_headers)

    def read_lines_to_boundary(self, fp_out=None):
        """Read bytes from self.fp and return or write them to a file.

        If the 'fp_out' argument is None (the default), all bytes read are
        returned in a single byte string.

        If the 'fp_out' argument is not None, it must be a file-like object that
        supports the 'write' method; all bytes read will be written to the fp,
        and that fp is returned.
        """
        endmarker = self.boundary + ntob("--")
        delim = ntob("")
        prev_lf = True
        lines = []
        seen = 0
        while True:
            line = self.fp.readline(1<<16)
            if not line:
                raise EOFError("Illegal end of multipart body.")
            if line.startswith(ntob("--")) and prev_lf:
                strippedline = line.strip()
                if strippedline == self.boundary:
                    break
                if strippedline == endmarker:
                    self.fp.finish()
                    break

            line = delim + line

            if line.endswith(ntob("\r\n")):
                delim = ntob("\r\n")
                line = line[:-2]
                prev_lf = True
            elif line.endswith(ntob("\n")):
                delim = ntob("\n")
                line = line[:-1]
                prev_lf = True
            else:
                delim = ntob("")
                prev_lf = False

            if fp_out is None:
                lines.append(line)
                seen += len(line)
                if seen > self.maxrambytes:
                    fp_out = self.make_file()
                    for line in lines:
                        fp_out.write(line)
            else:
                fp_out.write(line)

        if fp_out is None:
            result = ntob('').join(lines)
            for charset in self.attempt_charsets:
                try:
                    result = result.decode(charset)
                except UnicodeDecodeError:
                    pass
                else:
                    self.charset = charset
                    return result
            else:
                raise cherrypy.HTTPError(
                    400, "The request entity could not be decoded. The following "
                    "charsets were attempted: %s" % repr(self.attempt_charsets))
        else:
            fp_out.seek(0)
            return fp_out

    def default_proc(self):
        """Called if a more-specific processor is not found for the ``Content-Type``."""
        if self.filename:
            # Always read into a file if a .filename was given.
            self.file = self.read_into_file()
        else:
            result = self.read_lines_to_boundary()
            if isinstance(result, basestring):
                self.value = result
            else:
                self.file = result

    def read_into_file(self, fp_out=None):
        """Read the request body into fp_out (or make_file() if None). Return fp_out."""
        if fp_out is None:
            fp_out = self.make_file()
        self.read_lines_to_boundary(fp_out=fp_out)
        return fp_out

Entity.part_class = Part

try:
    inf = float('inf')
except ValueError:
    # Python 2.4 and lower
    class Infinity(object):
        def __cmp__(self, other):
            return 1
        def __sub__(self, other):
            return self
    inf = Infinity()


comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept-Encoding',
    'Accept-Language', 'Accept-Ranges', 'Allow', 'Cache-Control', 'Connection',
    'Content-Encoding', 'Content-Language', 'Expect', 'If-Match',
    'If-None-Match', 'Pragma', 'Proxy-Authenticate', 'Te', 'Trailer',
    'Transfer-Encoding', 'Upgrade', 'Vary', 'Via', 'Warning', 'Www-Authenticate']


class SizedReader:

    def __init__(self, fp, length, maxbytes, bufsize=DEFAULT_BUFFER_SIZE, has_trailers=False):
        # Wrap our fp in a buffer so peek() works
        self.fp = fp
        self.length = length
        self.maxbytes = maxbytes
        self.buffer = ntob('')
        self.bufsize = bufsize
        self.bytes_read = 0
        self.done = False
        self.has_trailers = has_trailers

    def read(self, size=None, fp_out=None):
        """Read bytes from the request body and return or write them to a file.

        A number of bytes less than or equal to the 'size' argument are read
        off the socket. The actual number of bytes read are tracked in
        self.bytes_read. The number may be smaller than 'size' when 1) the
        client sends fewer bytes, 2) the 'Content-Length' request header
        specifies fewer bytes than requested, or 3) the number of bytes read
        exceeds self.maxbytes (in which case, 413 is raised).

        If the 'fp_out' argument is None (the default), all bytes read are
        returned in a single byte string.

        If the 'fp_out' argument is not None, it must be a file-like object that
        supports the 'write' method; all bytes read will be written to the fp,
        and None is returned.
        """

        if self.length is None:
            if size is None:
                remaining = inf
            else:
                remaining = size
        else:
            remaining = self.length - self.bytes_read
            if size and size < remaining:
                remaining = size
        if remaining == 0:
            self.finish()
            if fp_out is None:
                return ntob('')
            else:
                return None

        chunks = []

        # Read bytes from the buffer.
        if self.buffer:
            if remaining is inf:
                data = self.buffer
                self.buffer = ntob('')
            else:
                data = self.buffer[:remaining]
                self.buffer = self.buffer[remaining:]
            datalen = len(data)
            remaining -= datalen

            # Check lengths.
            self.bytes_read += datalen
            if self.maxbytes and self.bytes_read > self.maxbytes:
                raise cherrypy.HTTPError(413)

            # Store the data.
            if fp_out is None:
                chunks.append(data)
            else:
                fp_out.write(data)

        # Read bytes from the socket.
        while remaining > 0:
            chunksize = min(remaining, self.bufsize)
            try:
                data = self.fp.read(chunksize)
            except Exception:
                e = sys.exc_info()[1]
                if e.__class__.__name__ == 'MaxSizeExceeded':
                    # Post data is too big
                    raise cherrypy.HTTPError(
                        413, "Maximum request length: %r" % e.args[1])
                else:
                    raise
            if not data:
                self.finish()
                break
            datalen = len(data)
            remaining -= datalen

            # Check lengths.
            self.bytes_read += datalen
            if self.maxbytes and self.bytes_read > self.maxbytes:
                raise cherrypy.HTTPError(413)

            # Store the data.
            if fp_out is None:
                chunks.append(data)
            else:
                fp_out.write(data)

        if fp_out is None:
            return ntob('').join(chunks)

    def readline(self, size=None):
        """Read a line from the request body and return it."""
        chunks = []
        while size is None or size > 0:
            chunksize = self.bufsize
            if size is not None and size < self.bufsize:
                chunksize = size
            data = self.read(chunksize)
            if not data:
                break
            pos = data.find(ntob('\n')) + 1
            if pos:
                chunks.append(data[:pos])
                remainder = data[pos:]
                self.buffer += remainder
                self.bytes_read -= len(remainder)
                break
            else:
                chunks.append(data)
        return ntob('').join(chunks)

    def readlines(self, sizehint=None):
        """Read lines from the request body and return them."""
        if self.length is not None:
            if sizehint is None:
                sizehint = self.length - self.bytes_read
            else:
                sizehint = min(sizehint, self.length - self.bytes_read)

        lines = []
        seen = 0
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
            seen += len(line)
            if seen >= sizehint:
                break
        return lines

    def finish(self):
        self.done = True
        if self.has_trailers and hasattr(self.fp, 'read_trailer_lines'):
            self.trailers = {}

            try:
                for line in self.fp.read_trailer_lines():
                    if line[0] in ntob(' \t'):
                        # It's a continuation line.
                        v = line.strip()
                    else:
                        try:
                            k, v = line.split(ntob(":"), 1)
                        except ValueError:
                            raise ValueError("Illegal header line.")
                        k = k.strip().title()
                        v = v.strip()

                    if k in comma_separated_headers:
                        # Look up the existing value by the header name 'k';
                        # the name 'envname' used here previously was undefined.
                        existing = self.trailers.get(k)
                        if existing:
                            v = ntob(", ").join((existing, v))
                    self.trailers[k] = v
            except Exception:
                e = sys.exc_info()[1]
                if e.__class__.__name__ == 'MaxSizeExceeded':
                    # Post data is too big
                    raise cherrypy.HTTPError(
                        413, "Maximum request length: %r" % e.args[1])
                else:
                    raise


class RequestBody(Entity):
    """The entity of the HTTP request."""

    bufsize = 8 * 1024
    """The buffer size used when reading the socket."""

    # Don't parse the request body at all if the client didn't provide
    # a Content-Type header. See http://www.cherrypy.org/ticket/790
    default_content_type = ''
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however, the MIME spec
    declares that a part with no Content-Type defaults to "text/plain"
    (see :class:`Part<cherrypy._cpreqbody.Part>`).
    """

    maxbytes = None
    """Raise ``MaxSizeExceeded`` if more bytes than this are read from the socket."""

    def __init__(self, fp, headers, params=None, request_params=None):
        Entity.__init__(self, fp, headers, params)

        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1
        # When no explicit charset parameter is provided by the
        # sender, media subtypes of the "text" type are defined
        # to have a default charset value of "ISO-8859-1" when
        # received via HTTP.
        if self.content_type.value.startswith('text/'):
            for c in ('ISO-8859-1', 'iso-8859-1', 'Latin-1', 'latin-1'):
                if c in self.attempt_charsets:
                    break
            else:
                self.attempt_charsets.append('ISO-8859-1')

        # Temporary fix while deprecating passing .parts as .params.
        self.processors['multipart'] = _old_process_multipart

        if request_params is None:
            request_params = {}
        self.request_params = request_params

    def process(self):
        """Process the request entity based on its Content-Type."""
        # "The presence of a message-body in a request is signaled by the
        # inclusion of a Content-Length or Transfer-Encoding header field in
        # the request's message-headers."
        # It is possible to send a POST request with no body, for example;
        # however, app developers are responsible in that case to set
        # cherrypy.request.process_request_body to False so this method
        # isn't called.
        h = cherrypy.serving.request.headers
        if 'Content-Length' not in h and 'Transfer-Encoding' not in h:
            raise cherrypy.HTTPError(411)

        self.fp = SizedReader(self.fp, self.length,
                              self.maxbytes, bufsize=self.bufsize,
                              has_trailers='Trailer' in h)
        super(RequestBody, self).process()

        # Body params should also be a part of the request_params
        # add them in here.
        request_params = self.request_params
        for key, value in self.params.items():
            # Python 2 only: keyword arguments must be byte strings (type 'str').
            if sys.version_info < (3, 0):
                if isinstance(key, unicode):
                    key = key.encode('ISO-8859-1')

            if key in request_params:
                if not isinstance(request_params[key], list):
                    request_params[key] = [request_params[key]]
                request_params[key].append(value)
            else:
                request_params[key] = value