Spending positions in the code table on capital letters is an unacceptable waste.
Let all letters be lower-case:
when a word is a proper name or begins a sentence,
one prefix byte before the word is enough to specify that its first letter is upper-case.
We shall call this prefix byte the mark "own name".
It works like this:
anna -> Anna.
The same applies to abbreviations.
One prefix byte before a word is enough to specify
that all letters up to the symbol "blank" are upper-case.
We shall call this byte the mark "abbreviation".
It works like this:
uno -> UNO.
The term mark designates either of these two prefix bytes. The user enters them by pressing the keys "Shift" and "Caps Lock".
When searching for a similar word, comparison of the various spellings (all letters lower-case, first letter upper-case, all letters upper-case) is now reduced to comparison of a single spelling (all letters lower-case). A widespread error is to equate the designation of letters (coding) with their graphic images (font): these are entirely different things. In addition, marks will help search servers find proper names and abbreviations in languages and alphabets that have no upper-case letters.
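The behaviour of the two marks can be illustrated with a minimal sketch. The code points chosen for the marks and the function names render and search_key are assumptions made only for this illustration; the text above does not assign any values.

```python
# Hypothetical code points for the two marks (taken from the Unicode
# Private Use Area); the text does not assign concrete values.
OWN_NAME = "\uE000"       # mark "own name": next letter is upper-case
ABBREVIATION = "\uE001"   # mark "abbreviation": letters up to the blank are upper-case
MARKS = {OWN_NAME, ABBREVIATION}


def render(text: str) -> str:
    """Expand marks into the conventional upper-case spelling."""
    out = []
    mode = None  # None, "first" (own name) or "all" (abbreviation)
    for ch in text:
        if ch == OWN_NAME:
            mode = "first"
        elif ch == ABBREVIATION:
            mode = "all"
        elif ch == " ":
            mode = None          # a mark never reaches past the blank
            out.append(ch)
        elif mode == "first":
            out.append(ch.upper())
            mode = None          # only the first letter is raised
        elif mode == "all":
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)


def search_key(text: str) -> str:
    """Drop marks so that all spelling variants compare equal."""
    return "".join(ch for ch in text if ch not in MARKS)
```

With these assumptions, render(OWN_NAME + "anna") gives "Anna", and a search only has to compare search_key values, because the stored letters are always lower-case.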
Indexes are used everywhere: science as a social phenomenon is impossible without them, and so all-embracing an industry as programming is inconceivable without them. The poor, archaic way of writing indexes in text files is striking and needs to be corrected.
Let us introduce terms.
Limits are records written above and below the line,
for example, limits of integration and summation.
Makeweights are limits and indexes;
there are only four kinds of makeweights:
top and bottom limits, top and bottom indexes.
The base is the word for which makeweights are written.
Opening symbols are control symbols which specify that
a makeweight follows;
there are four opening symbols, as many as there are kinds of makeweights:
bottom limit, top limit, bottom index, top index.
Since makeweights are written together with the base and
end at the nearest blank, a new symbol, space,
is used for blanks inside makeweights
(we shall keep the name blank for the blank
between words).
Limits can have indexes.
Indexes can have diacritical marks (which are essentially limits).
Thus the opening symbol of a limit does not end an index, and vice versa.
A base can have a set of indexes with sub-indexes.
If, after a sub-index of some index, another sub-index of the same index must be written,
it is necessary to return to the previous level.
Let us introduce the control symbol "return"
for returning to the previous level.
It works so:
aij -> .
So six new control symbols and six new keys on the keyboard are needed for writing limits and indexes. The term herald designates any of these six symbols.
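The nesting rules for heralds can be sketched as a small stack-based parser. Everything concrete here is an assumption made for illustration: the code points of the six heralds, the function names parse_word and render, and the LaTeX-like output notation. The sketch also assumes that "return" closes exactly one level and that a word has already been cut at the nearest blank.

```python
# Hypothetical code points for the six heralds (Unicode Private Use Area).
BOTTOM_LIMIT, TOP_LIMIT, BOTTOM_INDEX, TOP_INDEX, SPACE, RETURN = (
    chr(0xE000 + i) for i in range(6)
)

# Output notation is an assumption: "_" and "^" for indexes,
# "\under" and "\over" for limits.
OPENERS = {
    BOTTOM_LIMIT: "\\under",
    TOP_LIMIT: "\\over",
    BOTTOM_INDEX: "_",
    TOP_INDEX: "^",
}


def parse_word(word: str) -> list:
    """Turn one word (base plus makeweights) into a nested list."""
    root: list = []
    stack = [root]                      # stack[-1] is the current level
    for ch in word:
        if ch in OPENERS:
            node = (OPENERS[ch], [])    # a new makeweight inside the current level
            stack[-1].append(node)
            stack.append(node[1])
        elif ch == RETURN:
            if len(stack) > 1:          # go back to the previous level
                stack.pop()
        elif ch == SPACE:
            stack[-1].append(" ")       # blank inside a makeweight
        else:
            stack[-1].append(ch)
    return root


def render(items: list) -> str:
    """Print the nested list in a LaTeX-like notation."""
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        else:
            tag, children = item
            out.append(f"{tag}{{{render(children)}}}")
    return "".join(out)
```

For example, the word "a", bottom index, "i", bottom index, "j", return, "k" is rendered as a_{i_{j}k}: the symbol "return" leaves the sub-index level, so "k" continues on the level of "i".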
A paragraph can contain phrases
that need to be set off by a different colour, size, underlining, weight or slant;
we shall call them fractions
(fractions cannot be nested).
Besides that, it is hard to imagine a modern text without hyperlinks.
Thus we come to special signs for text, like "<" and ">" in
html.
Let the control symbols
beginning of region
and middle of region
stand before each fraction,
and the control symbol end of region
stand after each fraction.
The special binary (not textual) structures
byte-predictor and fractional record
lie between "beginning of region" and "middle of region".
text [beginning of region] BytePredictor FractionalRecord [middle of region] fraction [end of region] text
The byte-predictor consists of 6 bits: 3 of them specify that the parameters "color", "fontsize" and "number" are present in the fractional record ("number" is the identifier of a hyperlink; the hardware passes this identifier to the program when the user clicks the fraction), and the next 3 bits specify that the fraction is underlined, bold, oblique.
The fractional record consists of the fields "color", "fontsize" and "number" and has variable size: each of these fields may be present or absent, depending on the value of the corresponding bit of the byte-predictor (if the bit equals one, the field is present in the fractional record). The fields that are present specify the colour, font size and number of the fraction.
The term singularity designates any of the three new symbols, and the term
region designates the whole construction:
[beginning of region] BytePredictor color fontsize number [middle of region] fraction [end of region]
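The byte-predictor and the variable-size fractional record can be sketched as follows. The text fixes only the meaning of the six bits; the bit positions, the field widths (3 bytes for "color", 1 byte for "fontsize", 2 bytes for "number") and the function names pack_header and unpack_header are assumptions made for this illustration.

```python
import struct

# Assumed bit positions: three presence bits, then three style bits.
HAS_COLOR, HAS_FONTSIZE, HAS_NUMBER = 0x01, 0x02, 0x04
UNDERLINED, BOLD, OBLIQUE = 0x08, 0x10, 0x20


def pack_header(color=None, fontsize=None, number=None,
                underlined=False, bold=False, oblique=False) -> bytes:
    """Build the byte-predictor followed by the fractional record."""
    predictor = 0
    record = b""
    if color is not None:               # assumed: (r, g, b), one byte each
        predictor |= HAS_COLOR
        record += bytes(color)
    if fontsize is not None:            # assumed: one unsigned byte
        predictor |= HAS_FONTSIZE
        record += struct.pack("B", fontsize)
    if number is not None:              # assumed: 16-bit big-endian link id
        predictor |= HAS_NUMBER
        record += struct.pack(">H", number)
    if underlined:
        predictor |= UNDERLINED
    if bold:
        predictor |= BOLD
    if oblique:
        predictor |= OBLIQUE
    return bytes([predictor]) + record


def unpack_header(data: bytes):
    """Read the header; return the fields and the number of bytes consumed."""
    predictor = data[0]
    pos = 1
    fields = {}
    if predictor & HAS_COLOR:
        fields["color"] = tuple(data[pos:pos + 3])
        pos += 3
    if predictor & HAS_FONTSIZE:
        fields["fontsize"] = data[pos]
        pos += 1
    if predictor & HAS_NUMBER:
        (fields["number"],) = struct.unpack_from(">H", data, pos)
        pos += 2
    fields["underlined"] = bool(predictor & UNDERLINED)
    fields["bold"] = bool(predictor & BOLD)
    fields["oblique"] = bool(predictor & OBLIQUE)
    return fields, pos
```

A reader of the region first looks at the presence bits and only then knows how many bytes of the fractional record to read, which is why the record can stay short when a fraction is, say, only bold.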
Text editors put the mark "own name" after each full stop, i.e. after the end of a sentence, so the keys "Shift" and "CapsLock" are seldom used and should be moved to the periphery of the keyboard. Moreover, a single "Shift" key is enough (without its twin).
In a text editor the keys "Del" and "BkSp" do not delete control symbols (except the new-line symbol "\n") until two unprintable symbols occur in succession: if there is only one unprintable symbol, these keys first jump over it and only then delete a printable symbol. The jumps are executed as follows:
abcd | abd | |
abcd | acd | |
abcde | abce | |
abcd | acd | |
abcd | abcd |
There is no need to hold the "Shift" key while pressing another key: it is enough to press it beforehand. There is no need to press "CapsLock" a second time after the end of an abbreviation: the "CapsLock" mode switches off by itself at the end of the word. An erroneous press of either key is cancelled by pressing it again.
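The key handling described here can be sketched as a function that turns a sequence of key presses into stored text. The mark code points, the key names and the function name type_keys are assumptions for illustration. The sketch assumes that "repeated pressing" means pressing the same key again immediately, before any other key.

```python
# Hypothetical code points for the two marks (Unicode Private Use Area).
OWN_NAME = "\uE000"
ABBREVIATION = "\uE001"
KEY_TO_MARK = {"Shift": OWN_NAME, "CapsLock": ABBREVIATION}


def type_keys(keys):
    """Return the stored text produced by a sequence of key presses."""
    buffer = []
    for key in keys:
        mark = KEY_TO_MARK.get(key)
        if mark is None:
            buffer.append(key)          # ordinary key: store its character
        elif buffer and buffer[-1] == mark:
            buffer.pop()                # repeated press cancels the mark
        else:
            buffer.append(mark)         # the mark is simply a prefix symbol
    return "".join(buffer)
```

Because the mark is stored as a prefix and its effect ends at the blank, the editor needs no separate "CapsLock is on" state: typing "CapsLock", "u", "n", "o", " ", "x" stores the mark, "uno", a blank and "x", and only "uno" is shown in upper case.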
A number has the form [-|+]x[(.x)][[(|]e)|(*10)[-|+]x], where "x" is a sequence of digits; optional components are written in square brackets; components that occur only together are written in parentheses; alternatives are separated by the sign "|".
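One possible reading of this number format can be checked with a regular expression. The reading itself is an assumption: an optional sign, a digit sequence, an optional fractional part, and an optional exponent introduced by either "e" or "*10" and followed by an optional sign and digits. The names NUMBER and is_number are hypothetical.

```python
import re

# Assumed reading of the format: [-|+]x[.x][(e|*10)[-|+]x]
NUMBER = re.compile(r"[-+]?[0-9]+(?:\.[0-9]+)?(?:(?:e|\*10)[-+]?[0-9]+)?")


def is_number(text: str) -> bool:
    """True if the whole string matches the assumed number format."""
    return NUMBER.fullmatch(text) is not None
```

Under this reading "12", "-3.5", "+1e-9" and "2.5*10+3" are accepted, while ".5" and "1." are rejected because the point and the digits after it may occur only together.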
Cursor movement in "as is" editors (MS Word, etc.) differs from movement in a text file: the arrow keys "up" and "down" move the cursor between indexes, limits and the base line (whereas in a text file they always move between base lines).
The tags "sub" and "sup" for creating indexes are not needed in HTML documents. The symbol "space" can be used not in makeweights but in the base line as . Attribute values in html/xml documents should be enclosed in distinct opening and closing double quotation marks, because a string that itself contains double quotation marks can be an attribute value (for example, a database query containing a text string).
For parser and scanner generators (bison, yacc, byacc; flex): we designate the operation whose left operand is a base and whose right operand is the bottom index of that base as "$b".
The special designations needed in archaic programming languages for control symbols are given below:
Dmitry Turin, dmitryturin@yandex.ru, PGP