Package gluon :: Module utf8 :: Class Utf8

Class Utf8

object --+        
         |        
basestring --+    
             |    
           str --+
                 |
                Utf8

Class for storing and manipulating UTF-8 strings.

This class rests on one assumption: "ALL strings in the application are either UTF-8 encoded or of unicode type, even when the plain str type is used. UTF-8 is only a 'packed' form of unicode, so UTF-8 and unicode strings are interchangeable."
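The interchangeability claim can be checked with a short round trip. This is only an illustrative sketch using built-in types, not code from the module; the sample string is an arbitrary choice.

```python
# A unicode string and its UTF-8 "packed" form carry the same text.
text = u"caf\u00e9"              # four code points
packed = text.encode("utf-8")    # five bytes: the accented letter needs two

# Decoding the packed form gives back exactly the original text.
restored = packed.decode("utf-8")

assert packed == b"caf\xc3\xa9"
assert restored == text
assert len(text) == 4 and len(packed) == 5
```

The two lengths differ, which is why a class that stores UTF-8 bytes has to decode before doing character-based work.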

CAUTION! This class is slower than str/unicode. Do NOT use it inside intensive loops. Instead, decode the string(s) to unicode before the loop and encode the result back to UTF-8 after the intensive calculation.
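The decode-before, encode-after pattern recommended above might look like the following sketch. The function name reverse_utf8 is hypothetical and only built-in types are used, so the same idea applies whether the input is a plain UTF-8 byte string or a Utf8 instance.

```python
def reverse_utf8(data):
    """Reverse the characters of a UTF-8 byte string (hypothetical helper)."""
    text = data.decode("utf-8")      # decode once, before the loop
    chars = []
    for ch in text:                  # the loop sees whole characters, not bytes
        chars.append(ch)
    chars.reverse()
    return u"".join(chars).encode("utf-8")   # encode once, after the loop


result = reverse_utf8(b"caf\xc3\xa9")
assert result == b"\xc3\xa9fac"
```

Reversing the raw bytes instead would split the two-byte accented letter and produce invalid UTF-8, so the per-character work has to happen on the decoded form.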

The doctests() in the module source demonstrate the benefits of this class.

Instance Methods

__repr__(self)
Note: raw strings are used to avoid doubling backslashes. This function is a clone of web2py's gluon.languages.utf_repl() function.

__size__(self)
length of utf-8 string in bytes

__contains__(self, other)
y in x

__getitem__(self, index)
x[y]

__getslice__(self, begin, end)
x[i:j]

__add__(self, other)
x+y

__len__(self)
len(x)

__mul__(self, integer)
x*n

__eq__(self, string)
x==y

__ne__(self, string)
x!=y

string

capitalize(self)
Return a copy of the string S with only its first character capitalized.

string

center(self, length)
Return S centered in a string of length width.

string

upper(self)
Return a copy of the string S converted to uppercase.

string

lower(self)
Return a copy of the string S converted to lowercase.

string

title(self)
Return a titlecased version of S, i.e. words start with uppercase characters and all remaining cased characters are lowercase.

int

index(self, string)
Like S.find() but raise ValueError when the substring is not found.

bool

isalnum(self)
Return True if all characters in S are alphanumeric and there is at least one character in S, False otherwise.

bool

isalpha(self)
Return True if all characters in S are alphabetic and there is at least one character in S, False otherwise.

bool

isdigit(self)
Return True if all characters in S are digits and there is at least one character in S, False otherwise.

bool

islower(self)
Return True if all cased characters in S are lowercase and there is at least one cased character in S, False otherwise.

bool

isspace(self)
Return True if all characters in S are whitespace and there is at least one character in S, False otherwise.

bool

istitle(self)
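Return True if S is a titlecased string and there is at least one character in S, i.e. uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return False otherwise.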

bool

isupper(self)
Return True if all cased characters in S are uppercase and there is at least one cased character in S, False otherwise.

string

zfill(self, length)
Pad a numeric string S with zeros on the left, to fill a field of the specified width.

string

join(self, iter)
Return a string which is the concatenation of the strings in the iterable.

string or unicode

lstrip(self, chars=None)
Return a copy of the string S with leading whitespace removed.

string or unicode

rstrip(self, chars=None)
Return a copy of the string S with trailing whitespace removed.

string or unicode

strip(self, chars=None)
Return a copy of the string S with leading and trailing whitespace removed.

string

swapcase(self)
Return a copy of the string S with uppercase characters converted to lowercase and vice versa.

int

count(self, sub, start=0, end=None)
Return the number of non-overlapping occurrences of substring sub in string S[start:end].

object

decode(self, encoding='utf-8', errors='strict')
Decodes S using the codec registered for encoding.

object

encode(self, encoding, errors='strict')
Encodes S using the codec registered for encoding.

string

expandtabs(self, tabsize=8)
Return a copy of S where all tab characters are expanded using spaces.

int

find(self, sub, start=None, end=None)
Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end].

string

ljust(self, width, fillchar=' ')
Return S left-justified in a string of length width.

(head, sep, tail)

partition(self, sep)
Search for the separator sep in S, and return the part before it, the separator itself, and the part after it.

string

replace(self, old, new, count=-1)
Return a copy of string S with all occurrences of substring old replaced by new.

int

rfind(self, sub, start=None, end=None)
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end].

int

rindex(self, string)
Like S.rfind() but raise ValueError when the substring is not found.

string

rjust(self, width, fillchar=' ')
Return S right-justified in a string of length width.

(head, sep, tail)

rpartition(self, sep)
Search for the separator sep in S, starting at the end of S, and return the part before it, the separator itself, and the part after it.

list of strings

rsplit(self, sep=None, maxsplit=-1)
Return a list of the words in the string S, using sep as the delimiter string, starting at the end of the string and working to the front.

list of strings

split(self, sep=None, maxsplit=-1)
Return a list of the words in the string S, using sep as the delimiter string.

list of strings

splitlines(self, keepends=False)
Return a list of the lines in S, breaking at line boundaries.

bool

startswith(self, prefix, start=0, end=None)
Return True if S starts with the specified prefix, False otherwise.

string

translate(self, table, deletechars='')
Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256 or None.

bool

endswith(self, prefix, start=0, end=None)
Return True if S ends with the specified suffix, False otherwise.

string

format(self, *args, **kwargs)
Return a formatted version of S, using substitutions from args and kwargs.

__mod__(self, right)
x%y

__ge__(self, string)
x>=y

__gt__(self, string)
x>y

__le__(self, string)
x<=y

__lt__(self, string)
x<y

Inherited from str: __format__, __getattribute__, __getnewargs__, __hash__, __rmod__, __rmul__, __sizeof__, __str__

Inherited from str (private): _formatter_field_name_split, _formatter_parser

Inherited from object: __delattr__, __init__, __reduce__, __reduce_ex__, __setattr__, __subclasshook__
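To make the difference between byte-based and character-based operations concrete, here is a minimal stand-in written for Python 3, where bytes plays the role of Python 2's str and str the role of unicode. The class name Utf8Sketch is hypothetical, and the method bodies are assumptions inferred from the one-line summaries above, not the module's actual implementation.

```python
class Utf8Sketch(bytes):
    """Hypothetical stand-in: UTF-8 bytes with character-based operations."""

    def __new__(cls, content=b"", codepage="utf-8"):
        if isinstance(content, str):
            # Text input: pack it into UTF-8.
            content = content.encode("utf-8")
        elif codepage.lower() not in ("utf-8", "utf8"):
            # Bytes in another codepage: re-encode as UTF-8 (assumed behavior).
            content = content.decode(codepage).encode("utf-8")
        return bytes.__new__(cls, content)

    def __size__(self):
        # Length of the stored UTF-8 data in bytes.
        return bytes.__len__(self)

    def __len__(self):
        # Length in characters, not bytes.
        return len(self.decode("utf-8"))

    def __getitem__(self, index):
        # Index or slice by character, then pack the result again.
        return Utf8Sketch(self.decode("utf-8")[index])

    def upper(self):
        # Case conversion that also covers non-ASCII letters.
        return Utf8Sketch(self.decode("utf-8").upper())


s = Utf8Sketch("caf\u00e9")
assert len(s) == 4 and s.__size__() == 5
assert s[3] == b"\xc3\xa9"
assert s.upper() == b"CAF\xc3\x89"
```

Every operation in the sketch decodes and re-encodes, which is consistent with the caution that the class is slower than str/unicode in tight loops.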

Static Methods

a new object with type S, a subtype of T

__new__(cls, content='', codepage='utf-8')

Properties

Inherited from object: __class__

Method Details

__new__(cls, content='', codepage='utf-8') Static Method

__repr__(self) (Representation operator)

__contains__(self, other) (In operator)

__getitem__(self, index) (Indexing operator)

__getslice__(self, begin, end) (Slicing operator)

__add__(self, other) (Addition operator)

__len__(self) (Length operator)

__mul__(self, integer)

__eq__(self, string) (Equality operator)

__ne__(self, string)

capitalize(self)

center(self, length)

upper(self)

lower(self)

title(self)

index(self, string)

isalnum(self)

isalpha(self)

isdigit(self)

islower(self)

isspace(self)

istitle(self)

isupper(self)

zfill(self, length)

join(self, iter)

lstrip(self, chars=None)

rstrip(self, chars=None)

strip(self, chars=None)

swapcase(self)

count(self, sub, start=0, end=None)

decode(self, encoding='utf-8', errors='strict')

encode(self, encoding, errors='strict')

expandtabs(self, tabsize=8)

find(self, sub, start=None, end=None)

ljust(self, width, fillchar=' ')

partition(self, sep)

replace(self, old, new, count=-1)

rfind(self, sub, start=None, end=None)

rindex(self, string)

rjust(self, width, fillchar=' ')

rpartition(self, sep)

rsplit(self, sep=None, maxsplit=-1)

split(self, sep=None, maxsplit=-1)

splitlines(self, keepends=False)

startswith(self, prefix, start=0, end=None)

translate(self, table, deletechars='')

endswith(self, prefix, start=0, end=None)

format(self, *args, **kwargs)

__mod__(self, right)

__ge__(self, string) (Greater-than-or-equals operator)

__gt__(self, string) (Greater-than operator)

__le__(self, string) (Less-than-or-equals operator)

__lt__(self, string) (Less-than operator)
