Predictive analytics with big data-spark framework Article

Upadhyay, H, Lagos, L, Joshi, S et al. (2018). Predictive analytics with big data-spark framework. 36(2), 29-31.

cited authors

  • Upadhyay, H; Lagos, L; Joshi, S; Esoofally, M; Cooper, K


abstract

  • The system will extract and analyze meaningful information from the data generated by the various sensors on nuclear power plant equipment and process it in near real time using the big data Spark [1] framework, in-memory analytics, and traditional machine learning algorithms. The continuous stream of data generated by temperature sensors, pressure sensors, sensors that interact with the plant's physical equipment, and other real-time sources is compelling nuclear plant owners to imagine what they could do with this amount of data. The term "big data" applies here because the data is large in terms of volume (data size), velocity (speed of change), and variety (different forms of data). As more and more data is generated and collected, data analysis requires scalable, flexible, and high-performance tools to provide insights quickly. Integrating a big data platform into the nuclear power plant ecosystem addresses this need. This article proposes a design for building an open source predictive analytics system using Apache Spark. Real-time analytics coupled with machine learning can keep plant owners up to date on current performance and on potential risks, which are identified by analyzing patterns in the data. The machine learning models can be built from large volumes of historical and real-time nuclear plant sensor data. The trained model can then be used to predict variations in power plant equipment performance and maintenance needs.
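The train-then-predict flow described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' design and does not use Spark: it is plain Python standing in for the idea of learning a baseline from historical sensor data and then scoring incoming readings. The sensor names, sample values, the functions `fit_baseline` and `score_stream`, and the 3-standard-deviation threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev

def fit_baseline(history):
    """Learn a per-sensor baseline (mean, sample standard deviation) from historical readings."""
    return {sensor: (mean(values), stdev(values)) for sensor, values in history.items()}

def score_stream(baseline, readings, threshold=3.0):
    """Yield (sensor, value, z) for readings more than `threshold` standard deviations from baseline."""
    for sensor, value in readings:
        mu, sigma = baseline[sensor]
        # Guard against a constant historical signal (zero spread).
        z = (value - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            yield sensor, value, z

# Hypothetical historical data: illustrative values only, not real plant measurements.
history = {
    "temp_c": [290.0, 291.0, 289.5, 290.5, 290.0, 289.0, 291.0],
    "pressure_mpa": [15.4, 15.5, 15.6, 15.5, 15.4, 15.5, 15.6],
}
baseline = fit_baseline(history)

# Hypothetical incoming readings, processed one at a time as they would arrive from a stream.
stream = [
    ("temp_c", 290.2),
    ("pressure_mpa", 15.5),
    ("temp_c", 297.0),
    ("pressure_mpa", 16.4),
]
alerts = list(score_stream(baseline, stream))
for sensor, value, z in alerts:
    print(f"{sensor}: value={value} z={z:.1f}")
```

In a Spark-based system the baseline fit would presumably run as a batch job over historical data and the scoring step over a stream, but that mapping is an assumption here, not something the abstract specifies.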

publication date

  • March 1, 2018

start page

  • 29

end page

  • 31


volume

  • 36


issue

  • 2