ALL ARTICLES FOR Machine learning

 

Increasing yields has been a key goal for farmers since the dawn of agriculture. People have continually looked for ways to maximise food production from the land available to them. Until recently, land management techniques such as the use of fertilisers have been the primary tool for achieving this.

 

Challenges for Farmers

 

Whilst these techniques give a much improved chance of an increased yield, problems beyond the control of farmers have an enormous impact:

 

  1. Parasites - “rogue” plants growing amongst the crops may hinder growth; animals may destroy mature plants
  2. Weather - drought will prevent crops from flourishing, whilst heavy rain or prolonged periods of cold can be devastating for an entire season
  3. Human error - ramblers may trample on crops inadvertently, or farm workers may make mistakes
  4. Chance - sometimes it’s just the luck of the draw!

 

AI techniques can be used to reduce the element of randomness in farming. Identification of crop condition and the classification of likely causes of poor plant condition would allow remedial action to be taken earlier in the life cycle. This can also help prevent similar circumstances arising the following season.

 

Computer Vision to the Rescue

 

Computer vision is the most appropriate candidate technology for such systems. Images or video streams taken from fields could be fed into computer vision pipelines in order to detect features of interest.

 


 

A key issue in the development of computer vision systems is the availability of data; a potentially large number of images are required to train models. Ideal image datasets are often not available for public use; this is certainly the case in an agricultural context. Nor is the acquisition of such data a trivial exercise. Sample data is required over the entire life cycle of the plants - it takes many months for the plants to grow, and given the potential variation in environmental conditions, it could take years to gather a suitable dataset.

 

How Synthetic Data Can Help

 

The use of synthetic data offers a solution to this problem. The replication of nature synthetically poses a significant problem: the element of randomness. No two plants develop in the same way. The speed of growth, age, number and dimensions of plant features, and external factors such as sunlight, wind and precipitation all have an impact on the plant’s appearance.

 

Plant development can be modelled by the creation of L-systems for specific plants. These mathematical models can be implemented in tools such as Houdini. The Digica team used this approach to create randomised models of wheat plants.
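To make the idea concrete, here is a minimal sketch of stochastic L-system rewriting in Python. The axiom, symbols and production rules below are illustrative toy values, not Digica's actual wheat model:

```python
import random

# Minimal stochastic L-system: rewrite every symbol in parallel each generation.
# The rules below are toy examples; real plant models use carefully tuned rules.
RULES = {
    "X": ["F[+X]F[-X]+X", "F[-X]F[+X]-X"],  # branching alternatives, picked at random
    "F": ["FF"],                            # stem segments elongate each generation
}

def grow(axiom, generations, seed=None):
    """Apply the production rules `generations` times, starting from `axiom`."""
    rng = random.Random(seed)
    state = axiom
    for _ in range(generations):
        state = "".join(rng.choice(RULES.get(ch, [ch])) for ch in state)
    return state

plant = grow("X", 3, seed=42)
print(len(plant))  # the string grows rapidly with each generation
```

Interpreting the resulting string as turtle-graphics commands (`F` = draw a segment, `+`/`-` = turn, `[`/`]` = push/pop position) yields a branching plant shape; the random choice between rules is what gives each plant a different form.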

 


 

The L-system we developed allowed many aspects of the wheat plants to be randomised, including height, stem segment length, and leaf location and orientation. The effects of gravity were applied randomly, and different textures were applied to modify plant colouration. The Houdini environment is scriptable using Python, which lets us easily generate a very large number of synthetic wheat plants and so model entire fields.
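The scripted generation step might look something like the sketch below. The parameter names and ranges here are hypothetical; in a real Houdini pipeline the sampled values would be pushed onto node parameters via the `hou` Python API rather than collected in dictionaries:

```python
import random

def random_wheat_params(rng):
    """Sample one plant's parameters; names and ranges are illustrative only."""
    return {
        "height_cm": rng.uniform(60.0, 110.0),
        "stem_segment_cm": rng.uniform(8.0, 15.0),
        "leaf_count": rng.randint(4, 8),
        "leaf_angle_deg": rng.uniform(20.0, 70.0),
        "gravity_bend": rng.uniform(0.0, 0.3),  # random droop applied to stems
        "texture_id": rng.randint(0, 9),        # selects a colouration texture
    }

rng = random.Random(7)
# Seeded sampling keeps a "field" of plants reproducible between runs.
field = [random_wheat_params(rng) for _ in range(10_000)]
```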

 

The synthetic data is now suitable for training computer vision models for the detection of healthy wheat, enabling applications such as:

 

  • filtering wheat from other plants
  • identifying damaged wheat
  • locating stunted or unhealthy wheat
  • calculating biomass
  • assessing the maturity of wheat

 

With the planet’s food needs projected to grow by 50% by 2050, radical solutions are required. AI systems will provide a solution to many of these problems; the use of synthetic data is fundamental to successful deployments.

 

Digica’s team includes experts in the generation and use of synthetic data; we have worked with it in a variety of applications since our inception 5 years ago. We never imagined that it could be used in such complex, rich environments as agriculture. It seems that there are no limits to the use of synthetic data in the Machine Learning process!

 

Data is all around us, and we don't even see it.

 

 

Data Scientists usually work on projects related to well-known topics in Data Science and Machine Learning, for example, projects that rely on Computer Vision, Natural Language Processing (NLP) and Predictive Maintenance. However, at Digica, we're working on a few projects that do not really focus on processing visual data, text or numbers. In fact, these unusual projects focus on types of data that are flowing around us all the time, but which nevertheless remain invisible to us because we cannot see them.

 

1. WiFi


WiFi technology fills the space around us with radio waves, and those waves can convey more information than you might think. Having just a WiFi router and some mobile devices in a room is enough for us to detect what is happening there: movement distorts the waves in a way that we can pick up, for example, when someone raises a hand.

 

The nature of WiFi itself makes this technique pretty easy to set up. Firstly, as mentioned above, we don't need any extra instruments or tools, such as cameras; a router and some mobile devices are enough. And secondly, the technique can even work through walls, so we can use it throughout a whole house without running cables or adding extra equipment to each room.
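As a toy illustration of the underlying idea: motion perturbs the radio channel, so a windowed variance over received amplitudes spikes when something moves. Real systems work on full channel state information and trained models; the signal model and threshold below are made up purely for illustration:

```python
import random
import statistics

def motion_detected(amplitudes, window=20, threshold=0.5):
    """Flag each non-overlapping window whose variance exceeds `threshold`
    (a toy heuristic, not a real WiFi-sensing algorithm)."""
    flags = []
    for i in range(0, len(amplitudes) - window + 1, window):
        flags.append(statistics.pvariance(amplitudes[i:i + window]) > threshold)
    return flags

rng = random.Random(0)
still = [1.0 + rng.gauss(0, 0.05) for _ in range(100)]   # static room: small jitter
moving = [1.0 + rng.gauss(0, 2.0) for _ in range(100)]   # movement distorts the waves
print(any(motion_detected(still)), any(motion_detected(moving)))
```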


Some articles have already been published on human gesture recognition using this type of wave, for example, this article. In that and other articles, you can read about how the algorithm can generate a pretty detailed picture. For example, the algorithm can recognize a person's limbs one by one, and then construct a 3D skeleton of that person. In this way, it is possible to reproduce many elements of a person's position and gestures. It's actually a really cool effect, as long as a stranger is not looking at someone else's data, which would be quite creepy!

 

2. Microwaves

 


 

I'm sure that you have used microwaves to heat up a meal or cook food from scratch. And you may also be familiar with the idea of medical breast imaging. However, you might not know that those two topics use the same technology, but in different ways. 

 

It turns out that, after directing microwaves at breast tissue, the waves reflected back from healthy tissue look different from the waves reflected from malignant tissue. "So what", you may say, "we already have mammography for that." Yes, but mammograms give a higher exposure to radiation. And it is really difficult to distinguish healthy tissue from malignant tissue in dense breast mammogram images, as described in this link. Microwaves were first studied in 1886, but, as you see, they are now being put to new uses, such as showing up malignant tissue in a way that is completely non-invasive and harmless to people.

 

By the way, microwaves are also perfect for weather forecasting. This is because water droplets scatter microwaves, and using this concept helps us to recognize clouds in the sky!

 

3. CO2

 

Last but not least, we have Carbon Dioxide. This chemical compound is actually a great transmitter of knowledge. Did you know that CO2 can very accurately indicate the number of people in a room? Well, it does make sense because we generate CO2 all the time as a result of breathing. However, it’s not that obvious that we can be around 88% accurate in indicating the number of people in a given room! 

 

When this approach is set up, we can seamlessly detect, for example, that a room is unoccupied, and therefore it would be a good idea to save money by switching off all the electronics in that room. So this can be a great add-on to every smart home or office.

 

You might think that the simplest way to find out if a given room is unoccupied is to employ hardware specifically for this purpose, such as cameras and RFID tags. However, such a high-tech approach entails additional costs and, most importantly these days, carries the risk of breaching people's privacy. On the other hand, as described above, the data is already there, and just needs to be found and utilized to achieve the required result.

 

In the simplest case, we just read the levels of CO2 gas in a room, and plot those levels against time. Sometimes, for this task, we can also track the temperature of the room, as in this experiment. However, note that temperature data is often already available, for example, in air-conditioning systems. We only need to read the existing data, and then analyse that data correctly in order to provide the insight that is required in the particular project.
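A minimal version of this idea might look like the sketch below, where a room is inferred to be occupied when CO2 sits well above an outdoor baseline or is rising between consecutive readings. The baseline and thresholds are illustrative and would need per-room calibration:

```python
def occupied(co2_ppm, baseline=420.0, level_margin=150.0, rise_ppm=10.0):
    """Toy rule: a reading counts as 'occupied' if CO2 is well above the
    outdoor baseline, or climbing between consecutive samples.
    All thresholds are illustrative, not calibrated values."""
    flags = []
    prev = co2_ppm[0]
    for reading in co2_ppm:
        flags.append(reading > baseline + level_margin or reading - prev > rise_ppm)
        prev = reading
    return flags

# Simulated readings: empty room, then people enter, then they leave again.
readings = [430, 425, 432, 520, 610, 700, 760, 640, 540, 470]
print(occupied(readings))
```

A real system would smooth the signal and account for ventilation, but even this crude rule shows how much an "invisible" gas reading reveals about a room.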



There are many, many more types of data that are invisible to the human eye, but offer an amazing playground for Data Scientists. For example, there are radio waves, a type of electromagnetic wave with wavelengths longer than microwaves. There are infrared waves, with wavelengths shorter than microwaves, which are great for thermal imaging. And there are sound waves, which we can use for echo-location (like bats). These waves were the first that came to mind, but I'm sure that there are many other sources of invisible data that can be re-used for the purposes of Data Science.

 

 

Since it’s 2021, it’s probably no surprise to you that heart rate can be measured using different gadgets like smartphones or smartwatches.

 

For some reason, it is quite natural for people to argue with each other all the time. Wives argue with husbands. Children argue with their parents. Facebook users argue with other Facebook users. United fans argue with City fans. And it goes without saying that … Data Scientists argue with other Data Scientists!

 

 

Nowadays, no one needs to be convinced of the power and usefulness of deep neural networks. AI solutions based on neural networks have revolutionised almost every area of technology, business, medicine, science and military applications. After the breakthrough win of Geoffrey Hinton's group in the ImageNet competition in 2012, neural networks became the most popular machine learning algorithm. Since then, 21st-century technology has come to rely increasingly on AI applications. We encounter AI solutions at almost every step of our daily lives - in cutting-edge technologies, entertainment systems, business solutions, protective systems, the medical domain and many more areas. In many of these areas, AI solutions work self-sufficiently, with little or no human supervision.

 

According to some sources, over 40% of all Internet traffic is made up of bot traffic, and malicious bots account for a significant proportion of it. This article describes a number of strategies (Machine Learning, user authentication using simple input devices, and behavioral biometrics) that you can use to distinguish automatically between human users and bots on the Internet.

 

The main focus in machine learning projects is to optimize metrics like accuracy, precision and recall. We put effort into hyper-parameter tuning or designing good data pre-processing. But what if these efforts don't seem to work?

If I were to point out the single most common mistake of a rookie Data Scientist, it would be focusing on the model rather than the data.

How can we help you?

 

To find out more about Digica, or to discuss how we may be of service to you, please get in touch.

Contact us