Determining the Amino Acid Composition of Soybean Proteins Using IR Scanners


Sergey Nizkii 1*, Galina Kodirova 2, Galina Kubankova 3

1 Associate Professor, Candidate of Biological Sciences (equivalent to PhD in Biology), Federal State Budget Scientific Institution “All-Russian Scientific Research Institute of Soybean”, Blagoveshchensk, Russia.

2 Candidate of Technical Sciences (equivalent to PhD in Technical Sciences), Federal State Budget Scientific Institution “All-Russian Scientific Research Institute of Soybean”, Blagoveshchensk, Russia.

3 Senior Researcher, Federal State Budget Scientific Institution “All-Russian Scientific Research Institute of Soybean”, Blagoveshchensk, Russia.


*Email: luchezarnaya @



The amino acid composition of plant proteins comprises twenty individual amino acids and is expressed as the ratio of each of them to the sum of all (as a percentage). Sixteen of the twenty amino acids in plant proteins are determined most effectively on liquid chromatographs. High-performance liquid chromatography, however, is costly both in analysis time and in sample preparation, which makes it unsuitable for mass analysis, for example for evaluating breeding material. For that purpose, a method based on scanning in the near-infrared band is more efficient. Although IR-scanners can determine a fairly large number of components from a single calibration equation, a constant correction is required when the amino acid composition has to be determined and reduced to a percentage ratio. This article considers options for creating calibration equations (models) for determining the amino acid composition of soybean proteins in the computer programs (Nir 42, ISI) that operate IR-scanners such as the NIR-4250 or the FOSS NIRSistem 5000 (FOSS Analytical A/S, Denmark). It was found that, when creating calibration equations, it is most correct to specify for each amino acid its mass content (g per 100 g of protein) rather than its relative share (in %), as has also been done in other methods described in the literature.


Keywords: IR-scanner, chromatography, computer programs, calibration equations, amino acids, soybean.


To determine the content of amino acids in plant proteins, high-performance liquid chromatography is used, commonly in the form of automatic amino acid analyzers. Such devices can operate continuously for weeks. The duration of one complete analysis depends on the ion-exchange resin used and on the type of elution; even with modern equipment, an analysis takes several hours. Other physicochemical methods used for this purpose, such as capillary zone electrophoresis and thin-layer chromatography [1, 2], are no less time-consuming. The duration of the analysis and the difficulty of sample preparation limit the potential for mass analyses, for example in the study of collections of varieties and hybrids during breeding. To overcome this drawback, infrared scanners are often used for rapid mass analyses; they are also used successfully to determine the total content of protein, fat, and other economically valuable biochemical indicators of agricultural plants [3, 4].

The goal of the research is to develop calibration equations for determining the amino acid composition of proteins in plant products for the computer programs that operate IR-scanners.


Initially, liquid chromatography on an LKB 4101 automatic amino acid analyzer (Sweden) was used at the Federal State Budget Scientific Institution “All-Russian Scientific Research Institute of Soybean” (FSBSI ARSRI of Soybean) to determine the amino acid composition of proteins in the seeds and above-ground mass of soybean and wheat [5]. Sample preparation was very time-consuming: complete hydrolysis of the proteins in 6 N hydrochloric acid (HCl) in an autoclave at 105°C took twelve hours. The amino acids were then separated on a column packed with a sulfonated polystyrene cation exchanger, and complete separation on the column took more than three hours [1]. At the column outlet, the ninhydrin reaction with photometric detection made it possible to identify sixteen main amino acids. The amino acid composition of the proteins was established by manual or computer processing of the chromatogram, with calculations based on the peak areas. The content of each amino acid was expressed as its share (in %) of the sum of all amino acids that eluted from the column (taken as 100%). This did not take into account that proteins contain more than sixteen amino acids. Although more than two hundred amino acids have been found in nature and described in the scientific literature, only twenty of them are constituents of plant proteins [6]. Unfortunately, not all of them can be determined by liquid chromatography. Some, such as tryptophan, are destroyed during hydrolysis. Others, such as asparagine, do not give a color reaction with ninhydrin [6]. The duration of the analysis, including both sample preparation and chromatographic separation, creates difficulties when a large number of samples has to be analyzed.
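The percentage calculation described above can be illustrated with a minimal Python sketch. It assumes, as the text does, that the share of each amino acid is a simple ratio of its peak area to the total area of all eluted peaks. The function name and the peak areas are hypothetical and serve only as an illustration; they are not measured values.

```python
def percent_composition(peak_areas):
    """Return each amino acid's share (%) of the total peak area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas in arbitrary units (illustration only).
areas = {"lysine": 12.0, "leucine": 15.0, "valine": 9.0, "glycine": 24.0}
shares = percent_composition(areas)

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```

By construction the shares always add up to 100%, regardless of how many amino acids were lost before reaching the detector.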

It is well known that soybean is a high-protein crop, and the amino acid composition of its proteins is an important quality indicator. This applies especially to the content of the so-called “essential amino acids”, which are not produced in the human body and can only be obtained from food. More than a thousand entries (variety samples, hybrids, mutants, etc.) are tested in the breeding nurseries of FSBSI ARSRI of Soybean. They are evaluated for a variety of characteristics and indicators, including the amino acid composition of proteins [5, 7].

Currently, infrared (IR) scanners are widely used because they provide high-speed analysis with fairly simple sample preparation [8, 9]. For example, when analyzing the biochemical composition of seeds, many IR systems do not require the samples to be ground at all.

The principle of the IR-scanning method is to measure the intensity of diffusely reflected light at pre-selected wavelengths and to calculate the content of the components being analyzed using calibration equations [8, 10]. The reflection spectrum of milled soybean grain in the near-infrared region is shown in Figure 1.


Figure 1: The reflection spectrum of milled soybean grain in the near-infrared region (the abscissa is the wavelength in nm, and the ordinate is the inverse value of the reflection coefficient)

Each peak in this spectrum corresponds to a particular chemical indicator (fat, protein, carbohydrates, amino acids, etc.). Using calibration equations, the computer programs that operate IR-scanners compare the spectrum of the sample being analyzed with reference spectra generated from data obtained by chemical means and from a mathematical description of these spectra (second-order differential equations) [11]. To create a calibration equation, for example in the ISI program, the content of a particular component, determined by a chemical method in at least thirty samples, must be entered into the database. The computer program “decodes” the scanned spectrum of the sample being analyzed and gives the result in a few seconds. The technology makes it possible to analyze more than ten components in one sample simultaneously using one calibration equation [12, 13].
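The idea of a calibration equation can be sketched in Python under strong simplifying assumptions. The sketch fits a straight line between a spectral signal at a single wavelength and the reference content determined chemically, then uses the line to predict the content of a new sample. This is not the algorithm of the ISI or Nir 42 programs, which work with whole spectra and many wavelengths; the function names and all numbers are hypothetical, and only four calibration samples are used for brevity instead of the thirty or more mentioned above.

```python
def fit_calibration(signals, reference_values):
    """Least-squares fit of: reference = slope * signal + intercept."""
    n = len(signals)
    if n != len(reference_values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(signals) / n
    mean_y = sum(reference_values) / n
    sxx = sum((x - mean_x) ** 2 for x in signals)
    if sxx == 0:
        raise ValueError("signals must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(signals, reference_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(signal, slope, intercept):
    """Apply the fitted calibration line to a newly scanned sample."""
    return slope * signal + intercept


# Made-up calibration set: spectral signal at one wavelength versus the
# content found by the chemical reference method (g/100 g of protein).
signals = [0.10, 0.20, 0.30, 0.40]
reference = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_calibration(signals, reference)
estimate = predict(0.25, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimate={estimate:.3f}")
```

Each component gets its own fitted line, so nothing in the fitting step ties the predictions for different components to one another.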

Some difficulties arose in creating calibration equations for the analysis of amino acid composition. Initially, based on the results obtained on the liquid chromatograph, an attempt was made to create calibration equations (models) for determining the content of sixteen amino acids in the proteins of the beans and green mass of soybean and wheat, first for the NIR 4250 IR-scanner (Pacific Scientific, USA) and later for the FOSS NIRSistem 5000 scanner (FOSS Analytical A/S, Denmark). The calibration equations were created separately for the group of essential amino acids (arginine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, and valine; nine components in the equation) and for the group of nonessential amino acids (alanine, aspartic acid, cysteine, glutamic acid, proline, serine, and tyrosine; seven components in the equation). The difficulty was as follows. When chromatograms are analyzed and peak areas are calculated, the sum of the areas of all peaks is taken as 100%, and the share of each amino acid is calculated from a simple ratio. When a calibration equation is created from these data, however, the computer program that controls the scanner gives a result for each amino acid without taking into account that the sum of all amino acids must equal 100%. This becomes a serious problem when data on essential and nonessential amino acids are combined, that is, when the complete amino acid composition of the sample is described. For different samples the sum may be less or more than 100%, which complicates data analysis.
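The closure problem described above can be shown with a short Python sketch. It combines the outputs of two independent calibration equations, measures how far the total departs from 100%, and applies a proportional rescaling as one possible correction. The rescaling is an assumption made for illustration and is not necessarily the correction applied by the authors; the predicted values are made up, and only two amino acids per group are shown for brevity.

```python
def closure_gap(predicted_percent):
    """Return how far the sum of predicted shares is from 100%."""
    return sum(predicted_percent.values()) - 100.0


def renormalize(predicted_percent):
    """Rescale predicted shares proportionally so that they sum to 100%."""
    total = sum(predicted_percent.values())
    if total <= 0:
        raise ValueError("sum of predicted shares must be positive")
    return {name: 100.0 * value / total
            for name, value in predicted_percent.items()}


# Hypothetical outputs of two separate calibration equations (in %).
essential_pred = {"lysine": 30.0, "leucine": 25.0}
nonessential_pred = {"glutamic acid": 35.0, "proline": 20.0}

combined = {**essential_pred, **nonessential_pred}
gap = closure_gap(combined)      # positive: total above 100%
corrected = renormalize(combined)
print(f"gap = {gap:+.1f} percentage points")
```

Such a rescaling has to be recomputed for every sample, which matches the need for a constant correction noted in the text.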

To overcome this drawback, a series of chromatographic determinations of the content of individual amino acids in mass fractions (g/100 g of protein) was carried out. For this purpose, chemically pure amino acid preparations and the “additive technique” (standard addition) were used. A known quantity of a given amino acid was added to the samples prepared for chromatographic separation, and the mass fraction was determined from the proportional increase in the peak area. The data on the mass fraction of each amino acid were used to create calibration equations. With these equations, the total mass fraction of all amino acids determined on the scanner does not exceed 90 g/100 g of protein, which is consistent with the literature data, taking into account the loss of some amino acids during hydrolysis [2, 6].
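The arithmetic behind a single-point standard addition can be sketched as follows. The sketch assumes that the peak area is directly proportional to the mass of the amino acid, that the added standard does not change the injected volume, and that the mass of protein in the aliquot is known. Under these assumptions the original mass equals the added mass multiplied by the ratio of the original peak area to the increase in area. The function names and the numbers are hypothetical.

```python
def mass_by_standard_addition(area_before, area_after, added_mass):
    """Estimate the original mass of an amino acid from the area increase.

    Assumes area = k * mass, so:
        mass = added_mass * area_before / (area_after - area_before)
    """
    increase = area_after - area_before
    if increase <= 0:
        raise ValueError("peak area must increase after the addition")
    return added_mass * area_before / increase


def grams_per_100g_protein(amino_acid_mass, protein_mass):
    """Convert a mass in the aliquot to g per 100 g of protein."""
    if protein_mass <= 0:
        raise ValueError("protein mass must be positive")
    return 100.0 * amino_acid_mass / protein_mass


# Made-up example: the peak grows from 50 to 75 area units after 0.5 mg
# of the pure amino acid is added; the aliquot contains 16 mg of protein.
original_mg = mass_by_standard_addition(50.0, 75.0, 0.5)
content = grams_per_100g_protein(original_mg, 16.0)
print(f"{original_mg:.2f} mg in aliquot, {content:.2f} g/100 g of protein")
```

Because each amino acid is quantified against its own added standard, the resulting mass fractions are not forced to sum to any fixed total.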


Comparative data on the amino acid composition of soybean grain proteins, obtained by liquid chromatography and on the IR-scanner with different types of calibration equations, are given in Table 1.


Table 1: The results of determining the amino acid composition of soybean proteins on the liquid chromatograph and the FOSS NIRSistem 5000 IR-scanner according to different calibration equations

Column headings: amino acid; liquid chromatograph; g/100 g of protein; equation 1 (%); equation 2 (g/100 g of protein); sample 1 to sample 5.
Row groups: essential amino acids, with their total; nonessential amino acids (including alanine + glycine and aspartic acid), with their total; sum of all amino acids.

When the peak areas in the chromatograms are calculated and the percentage ratio of all the amino acids that eluted from the column is established, the sum is 100%, which makes it possible to judge the contribution of each amino acid to the characteristics of soybean proteins. For example, a content of the essential amino acid lysine of 6.32% of the total amount of amino acids characterizes soybean protein as a high-lysine product, which increases its value in feed mixtures for farm animals (Statutory Act “On the use of biologically active substances (BAS) and the norms of their introduction in the mixed fodder”, approved by the Ministry of Agriculture of Russia on April 30, 1997).

When the amino acid composition was determined on the IR-scanner using calibration equation 1, which is based on percentage-ratio data, the sum of amino acids came out above or below 100%. This result is not entirely correct and requires certain corrections [14]. In addition, there is a large variation in the data for some amino acids. For example, the content of the essential amino acid histidine ranges from 6.1 to 10.9%, and that of isoleucine from 4.31 to 6.12%. All this reduces the effectiveness of the method.

When calibration equation 2, which is based on the mass fraction of amino acids, was used, the total amino acid content was found to be no more than 90 g per 100 g of soybean protein, which is correct and corresponds to the literature data, given that some amino acids are destroyed during sample preparation.


Thus, the research showed that when the amino acid composition of soybean proteins and other plant materials is determined with IR-scanners, the calibration equations should be based on mass fraction data for the individual amino acids and not on their percentage ratio. Calibration equations calculated from the mass fractions of individual amino acids yield more correct data. Moreover, the use of IR-scanners increases the productivity of laboratory work, which is especially important for mass analyses. A calibration equation based on percentage-ratio data gives a sum of amino acids that differs from 100%, which is not correct. By contrast, calibration equations built on the absolute content of amino acids give about ninety grams of analyzed amino acids per 100 grams of protein, so this method is considered more effective. It takes into account only seventeen of the twenty amino acids that make up proteins, which is why their total content does not exceed 100 grams, in agreement with the literature. At the same time, the ratio between individual amino acids in units of mass is the same as their percentage ratio, which allows a correct analysis of the nutritional value of the protein.
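The last point, that ratios between amino acids are the same in mass units and in percentages, follows from the fact that both scales differ only by a common factor. A minimal Python check, with made-up mass fractions that total 90 g/100 g of protein, is given here as an illustration; the names and values are hypothetical.

```python
def relative_shares(mass_fractions):
    """Convert mass fractions (g/100 g of protein) to shares of their sum (%)."""
    total = sum(mass_fractions.values())
    if total <= 0:
        raise ValueError("sum of mass fractions must be positive")
    return {name: 100.0 * grams / total
            for name, grams in mass_fractions.items()}


# Hypothetical mass fractions in g/100 g of protein (illustration only).
masses = {"lysine": 6.0, "leucine": 7.5, "all others": 76.5}
shares = relative_shares(masses)

ratio_by_mass = masses["lysine"] / masses["leucine"]
ratio_by_share = shares["lysine"] / shares["leucine"]
print(f"by mass: {ratio_by_mass:.3f}, by share: {ratio_by_share:.3f}")
```

The conversion can therefore be done once, after prediction, without distorting the balance between individual amino acids.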


  1. Melnikov I.O., Development of micromethods for the analysis of amino acids, short peptides and oligonucleotides with the use of RP-HPLC (reversed-phase high-performance liquid chromatography) and capillary electrophoresis. Author's Dissertation of the Candidate of Chemical Sciences, 2006.
  2. Infrared Gas Analyzer Infrared Scanner-105. The maintenance manual MM 4434-001-67508564-2010. Saint Petersburg; 2010.
  3. Bayunov A.P., Smarygin S.N, The use of model systems for obtaining calibrations in the near infrared spectroscopy technique, Herald of Kazan Technological University, 2011; 11: 23-29.
  4. Mouazen A.M., Saeys W., Xing J., De Baerdemaeker J. and Ramon H., Near infrared spectroscopy for agricultural materials: an instrument comparison, J. Near Infrared Spectrosc., 2005, 13: 87-98.
  5. Sinegovskaya V.T., Kletkina O.O. History of the development of agrarian science in Priamurye. Odeon. Blagoveshchensk; 2018.
  6. Novikov N.N. Biochemistry of plants. Kolos. Moscow; 2013.
  7. Sinegovskaya V.T, The state and prospects of scientific support for soybean production, Far East Agrarian Bulletin, 2016, 4 (40): 8-12.
  8. Efimenko S.G., Efimenko S.K., Kucherenko L.A., Nagalevskaya Ya.A, Express-evaluation of the content of basic fatty acids in the rape seeds oil through IR-spectrometry, Scientific and Technical Bulletin of the All-Russian Scientific Research Institute of Oilseeds, 2015, 4 (164): 24-39.
  9. Shimizu N., Evaluating techniques for rice grain quality using near infrared transmission spectroscopy, J. Near Infrared Spectrosc., 2008, 6(A): 111-116.
  10. Vasilev A.V., Grinenko E.V., Schukin A.O., Fedulina T.G. Infrared Spectroscopy of Organic Nature Compositions: Study guide. SPSLTA, Saint Petersburg; 2007.
  11. Williams P. and Norris K. Near-Infrared Technology in the Agricultural and Food Industry. American Association of Cereal Chemists. Inc. St. Paul. Minnesota, USA; 2007.
  12. Williams P.C., Corderio H.M., Effect of calibration practice on correlation of errors induced in near-infrared protein testing of hard red spring wheat growing location and season, J. Agric. Sci., 2005,104:113-123.
  13. Watson C.A., Carville D., Dikermane E., Daigger G., Booth G.D., Evaluation of two infrared instruments for determination protein content of hard red winter wheat, J. Cereal Chem., 2006, 53: 214-222.
  14. Sekanov Yu.P., Stepanov M.A., Scientific foundations and experience in the use of means for non-destructive quality control of products and technological processes in crop production, Bulletin of the All-Russian Scientific Research Institute of Livestock Mechanization, 2016, 4 (24): 110-115.