Lexicalization of the Prefix bi- and the <Bi> Bound Base

The English morpheme <Bi> denoting “two, twice, both, double” comes from the Latin prefix bi-. The Latin bi- comes from the adverb bis meaning “twice, two times.” The adverb bis comes from the numeral duo meaning “two.” The bound bases <Du> and <Di> and the free base <Two>, all denoting “two,” are related to the morpheme <Bi>. <Du> also comes from Latin duo. <Di> comes from Ancient Greek di-, which is a shortened form of the adverb dis meaning “twice, doubly.” <Two> is Germanic from Old English twā.

(Note that English also has a <Bi> bound base denoting “life” as in biome and biology from Ancient Greek bíos, a <Di> bound base denoting “day” as in dial and dianthus from Latin diēs, and a <Du> bound base denoting “course, channel” as in conduit from Old French -duit from Latin ductus. Just like words, morphemes can have homographs.)

What type of morpheme is the <Bi> denoting “two”? Word after word provides evidence that English bi- is a prefix. For example:

<bi + Ann + u + al → biannual> “two times a year”
<bi + Colo(u)r + ed → bicolored> “of two colors”
<bi + Cycl(e) → bicycle> “two wheels”
<bi + Fold → bifold> “two folds”
<bi + Fer(e) + ous → biferous> “bearing fruit or flowers two times a year”
<bi + Furc + ate → bifurcate> “having two forks”
<bi + Gam(e) + y → bigamy> “two marriages”
<bi + Helice + al → bihelical> “having two helices”
<bi + Juge + ate → bijugate> “having two pairs”
<bi + Labi + al → bilabial> “having two lips”
<bi + Morph → bimorph> “two forms”
<bi + Node → binode> “two nodes”
<bi + Ped(e) → biped> “two feet”
<bi + Pole + (a)r → bipolar> “having two poles”
<bi + Race + i + al → biracial> “having two races”
<bi + Sect → bisect> “two cuts”
<bi + Valve → bivalve> “two valves”
<bi + Zone + al → bizonal> “having two zones”

In all the words above, the morpheme <Bi> attaches to a base. Every word consists of at least one base. In some instances, the base is a free base. A free base can form a word without another morpheme, as in <Pole>, <Race>, and <Zone>, which form the base words pole, race, and zone. In other cases, the base is a bound base. A bound base requires at least one other morpheme to form a word, as in <Furc>, <Helice>, and <Labi>.

I could go on and on with hundreds of examples in which the morpheme <Bi> attaches to a base to form a word. Each example provides evidence that <Bi> is a prefix.

However, I recently found myself investigating the word gypsisol, which is a type of soil that contains gypsum.

<Gyps + um → gypsum>
<Gyps + i + Sol(e) → gypsisol>

The <Gyps> base ultimately comes from Ancient Greek gýpsos and denotes “mineral consisting of hydrated calcium sulphate, chalk, cement.”

While looking for additional morphological relatives with the <Gyps> base, I stumbled on the word gypsobioside. A gypsobioside is a type of steroid glycoside, which is a molecule composed of a sugar unit and a non-sugar unit linked together by a glycosidic bond. The exact chemistry is beyond me, but I understand the basic idea based on the morphology of the chemical terms. I initially found the word in Wiktionary. To confirm that the word is actually in use and not just a posited form, I searched Google Scholar for attested uses. I found gypsobioside in the published article “Glycosides of Erysimum” from 1967 in Chemistry of Natural Compounds and in Spectroscopic Data of Steroid Glycosides: Volume 6 from 2006, among other scientific sources.

Looking closer at gypsobioside, I discovered the word bioside.

<Gyps + o + Bioside → gypsobioside>

I found the word bioside in the PubChem Compound Summary from the National Center for Biotechnology Information and in a number of journal articles including “Flavonone biosides of Acinos thymoides” from 1966 in Chemistry of Natural Compounds. A bioside is a steroidal glycoside, which, interestingly, is found in horseradish. The <ide> of bioside is the chemical suffix -ide. The suffix comes from English oxide, from French -ide, from French acide, from Latin acidus, from acēre + -idus. In chemistry, the suffix is used for naming simple and related compounds. For example:

<Glyce + ose → glycose>
<Glyce + ose + ide → glycoside>
<Galact + ose → galactose>
<Galact + ose + ide → galactoside>
<Lact + ose → lactose>
<Lact + ose + ide → lactoside>
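
The spelling side of these word sums can be sketched in a few lines of Python. This is only a minimal sketch under one assumed, simplified rule: the final <e> of an element drops when the next element begins with a vowel letter. The function name join_word_sum is hypothetical, and the sketch does not handle the parenthesized letters used in some of the earlier word sums.

```python
VOWELS = set("aeiou")

def join_word_sum(elements):
    """Join the elements of a word sum into a spelled word.

    Assumed simplified rule (for illustration only): drop a final 'e'
    from an element when the next element begins with a vowel letter.
    """
    parts = []
    for i, element in enumerate(elements):
        piece = element.lower()
        # Look ahead to the next element, if there is one.
        nxt = elements[i + 1].lower() if i + 1 < len(elements) else ""
        if piece.endswith("e") and nxt and nxt[0] in VOWELS:
            piece = piece[:-1]
        parts.append(piece)
    return "".join(parts)

print(join_word_sum(["Glyce", "ose"]))         # glycose
print(join_word_sum(["Glyce", "ose", "ide"]))  # glycoside
print(join_word_sum(["Lact", "ose", "ide"]))   # lactoside
```

In each case the final <e> of <Glyce> or -ose is dropped before the vowel-initial -ose or -ide, which matches the spellings in the word sums above.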

The -ose suffix is also a chemical suffix. The suffix primarily comes from English glucose, from French glucose, from Ancient Greek gleûkos meaning “sweetness, sweet wine,” from glykýs. (A second source is French -ose from Latin -ōsus as in cellulose.) The -ose suffix is used for naming sugars (saccharides).

Investigating bioside further, I see a possible -ide suffix and a possible -ose suffix. The Oxford English Dictionary tells me that biose comes from bi- + -ose and means “disaccharide.” According to the OED, the first attested use is from 1887 in the Journal of the American Chemical Society. Again, to confirm use of the word, I used Google Scholar and found the hyphenated lacto-N-biose and galacto-N-biose in the 2025 “Crystal Structure of Glycoside Hydrolase Family 20 Lacto-N-biosidase from Soil Bacterium Streptomyces sp. Strain 142” in the Journal of Applied Glycoscience.

Looking at the etymology from the OED, I immediately see a glaring problem: Every word must consist of at least one base. The word biose cannot be the prefix bi- and the suffix -ose because then the word would lack a base. A prefix cannot attach directly to a suffix. The -ose is an established suffix with an identifiable etymology and clear meaning. Thus, the base must be the morpheme <Bi> denoting “two.” That the <Bi> means “two” is further supported by the meaning of the word: a disaccharide (<Di + Sacchare + ide>), or a sugar consisting of two monosaccharides.
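
The “at least one base” test can be sketched as a small Python check. This is a minimal sketch, assuming the notation used in my word sums: bases are capitalized, and affixes and connecting vowels are lowercase. The function names parse_word_sum, is_base, and has_base are hypothetical and exist only for illustration.

```python
def parse_word_sum(word_sum):
    """Split '<bi + ose → biose>' into (['bi', 'ose'], 'biose')."""
    inner = word_sum.strip().strip("<>")
    left, _, right = inner.partition("→")
    elements = [e.strip() for e in left.split("+")]
    return elements, right.strip()

def is_base(element):
    # Assumed notation: a capitalized element is a base;
    # lowercase elements are affixes or connecting vowels.
    return element[:1].isupper()

def has_base(word_sum):
    """Return True if the word sum contains at least one base."""
    elements, _ = parse_word_sum(word_sum)
    return any(is_base(e) for e in elements)

print(has_base("<bi + ose → biose>"))             # False: prefix + suffix, no base
print(has_base("<Bi + ose → biose>"))             # True: <Bi> analyzed as a base
print(has_base("<bi + Ann + u + al → biannual>")) # True: <Ann> is the base
```

The check only reads the notation. It cannot decide whether a morpheme is a base, which still depends on the evidence from attested words.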

The word biose is evidence of the lexicalization of the bi- prefix into the bound base <Bi>.

<Bi + ose → biose>
<Bi + ose + ide → bioside>

Because biose is evidence that the <Bi> morpheme is a base, I must also update my previous word sums.

<Bi + Ann + u + al → biannual>
<Bi + Colo(u)r + ed → bicolored>
<Bi + Cycl(e) → bicycle>
<Bi + Fold → bifold>
<Bi + Fer(e) + ous → biferous>
<Bi + Furc + ate → bifurcate>
<Bi + Gam(e) + y → bigamy>
<Bi + Helice + al → bihelical>
<Bi + Juge + ate → bijugate>
<Bi + Labi + al → bilabial>
<Bi + Morph → bimorph>
<Bi + Node → binode>
<Bi + Ped(e) → biped>
<Bi + Pole + (a)r → bipolar>
<Bi + Race + i + al → biracial>
<Bi + Sect → bisect>
<Bi + Valve → bivalve>
<Bi + Zone + al → bizonal>

The morpheme <Bi> has lived two lives, first as the prefix bi- from the Latin prefix bi- and then as the bound base <Bi>. Although the lexicalization is quite recent, dating from 1887, it occurred nonetheless. The morpheme <Bi> is a bound base with <Bi + ose → biose> as the evidence.
