An-Najah National University

Department of Computer Engineering

Graduation Project II

ETHER: An Adaptive Solar Tracking System

Mohammad Raed

Mohammad Matar

Supervisor: Dr. Suleiman Abu Kharmeh

A report submitted in partial fulfilment of the requirements of
An-Najah National University for the degree of
Bachelor of Science in Computer Engineering

January 27, 2026


Abstract

The growing global demand for renewable energy has emphasized the critical need for maxi-
mizing solar energy system efficiency. This project presents the design and implementation of
an intelligent dual-axis solar tracking system that integrates embedded control, wireless con-
nectivity, and FPGA-based hardware acceleration to enhance tracking precision and system
responsiveness.

The system architecture comprises three main components working in coordination: an
Arduino microcontroller implementing a Proportional-Integral-Derivative (PID) controller to
drive servo motors for precise panel orientation, an ESP32 module providing wireless connectiv-
ity for system monitoring and dashboard applications, and a DE1-SoC FPGA board executing
a Kalman filter algorithm for predictive control. The Arduino calculates solar position using
built-in algorithms and communicates with the FPGA over UART to transmit sensor
data and receive filtered position estimates. The FPGA implementation uses DSP blocks to
accelerate the mathematical computations of the Kalman filter, providing real-time predictive
adjustments that compensate for measurement delays and noise in the control system.

The developed system successfully demonstrates improved tracking accuracy through the
integration of predictive filtering with embedded PID control. The FPGA-accelerated Kalman
filter effectively reduces positioning errors and system oscillations, while the wireless moni-
toring capability enables real-time performance assessment. Environmental sensors provide
contextual data for comprehensive system evaluation.

This work demonstrates how hardware-software co-design enhances renewable energy sys-
tems through improved precision, adaptability, and computational efficiency. The integration
of predictive control with FPGA acceleration represents a novel approach to solar tracking
that advances beyond traditional microcontroller-only solutions, offering significant potential
for both educational applications and small-scale renewable energy implementations.

Keywords: FPGA, SoC, Kalman Filter, PID Control, Solar Tracking, Embedded Sys-
tems, Renewable Energy, Internet of Things, DE1-SoC, Arduino, ESP32

The full project repository can be found at:

• https://github.com/mo-matar/Heterogeneous-Solar-Tracking-System



Acknowledgements

We would like to express our sincere thanks to our supervisor, Dr. Suleiman Abu Kharmeh, for
providing the DE1-SoC board and supporting us throughout this project. His continued guid-
ance, professional feedback, and effort helped us improve our work and overcome challenges
during development.



Contents

List of Figures

List of Tables

1 Introduction
1.1 General Background
1.2 Objectives and Purpose
1.3 Significance and Importance
1.4 Summary of Contributions and Achievements
1.5 Organization of the Report

2 Literature Review
2.1 Introduction
2.2 Existing Microcontroller Systems and Their Limitations
2.3 Kalman Filtering for Optimal State Estimation
2.4 FPGAs for Hardware-Accelerated Filtering
2.5 Research Gap and Opportunity
2.6 Proposed Heterogeneous Architecture
2.7 Summary

3 Methodology
3.1 Overall System Architecture and Block Diagram
3.2 Hardware Subsystem Design
3.2.1 Sensor and Actuation Layer (Arduino Domain)
3.2.2 Computation Layer (DE1-SoC FPGA Domain)
3.3 Kalman Filter Hardware Implementation
3.3.1 Algorithm Selection and Adaptation
3.3.2 Fixed-Point Numerical Representation
3.3.3 Hardware Architecture and Parallelization
3.3.4 Adaptive Noise Rejection Strategy
3.4 System Integration and Co-Design
3.4.1 Hardware-Software Interface (HPS-FPGA)
3.4.2 Inter-Processor Communication Protocols
3.5 Validation and Testing Methodology
3.5.1 Unit-Level Verification
3.5.2 Integration Testing

4 Results
4.1 Hardware Implementation and Synthesis Results
4.1.1 Resource Utilization and Performance
4.1.2 RTL Synthesis Hierarchy
4.2 Kalman Filter Algorithm Performance
4.2.1 Baseline Performance
4.2.2 Robustness Under Severe Noise
4.2.3 Spike Rejection
4.2.4 Sudden Change Response
4.2.5 Performance Summary
4.3 Integrated System Demonstration
4.3.1 Real-Time Dashboard
4.3.2 Physical Prototype
4.4 Summary

5 Discussion
5.1 Interpretation of Key Findings
5.1.1 Achieving the Performance Target
5.1.2 The Heterogeneous Architecture Validated
5.2 Comparative Analysis with Prior Work
5.2.1 Advancement Over Simple Arduino-Based Trackers
5.2.2 Differentiation from Advanced MCU-Only IoT Trackers
5.2.3 Bridging Research Implementations to Practical Systems
5.3 Implications of Hardware Design Choices
5.3.1 FPGA Resource Efficiency as Strategic Advantage
5.3.2 Fixed-Point Arithmetic Sufficiency Validated
5.4 Limitations and Practical Considerations
5.4.1 Latency Characterization
5.4.2 Environmental Validation Scope

6 Conclusion and Recommendations
6.1 Conclusion
6.2 Future Recommendations
6.2.1 Intelligent, Runtime-Adaptive Filtering via HPS Supervisory Control
6.2.2 Machine Learning for Parameter and State Prediction
6.2.3 Advanced Power Management via Low-Power FPGA Techniques
6.2.4 Hybrid Predictive Tracking with Sun Ephemeris Fusion
6.2.5 Enhanced Dashboard with Proactive Analytics

References


List of Figures

3.1 System-level block diagram showing the heterogeneous data flow between
Arduino (sensor/actuation layer), DE1-SoC FPGA (computation layer), and
ESP32 (communication layer)

3.2 Arduino Mega connection diagram showing LDR sensor interfacing, servo motor
control, and UART communication links to FPGA and ESP32

3.3 Cyclone V SoC top-level block diagram showing the FPGA fabric, HPS subsystem,
and their interconnections via high-performance AXI bridges

3.4 Platform Designer (Qsys) system interconnection diagram showing the HPS,
UART controller, PIO ports, and custom Avalon bridge

3.5 Platform Designer address map showing base addresses and spans for all
memory-mapped peripherals in the FPGA fabric

3.6 Kalman filter top-level RTL block diagram showing dual independent filter
instances and external interface signals

3.7 Kalman filter finite state machine showing the sequential control flow through
predict, update, and output phases

3.8 External bus to Avalon bridge timing diagram showing the standard Avalon-MM
read and write cycles (adapted from Intel University Program IP documentation)

3.9 Kalman-to-Avalon bridge state machine showing UART polling, data assembly,
filter triggering, and result transmission phases

4.1 FPGA resource utilization summary
4.2 Top-level synthesis hierarchy
4.3 Kalman filter top module synthesis showing dual independent filter cores
4.4 Tracking performance for trajectory with raw vs filtered angles
4.5 Normal trajectory tracking with 2° Gaussian noise
4.6 Performance with 4° noise plus random large spikes (3% occurrence rate)
4.7 Trajectory tracking with deliberate sharp spikes (20-45°)
4.8 Close-up showing spike attenuation
4.9 Response to legitimate sudden position changes (six abrupt jumps 20-60°)
4.10 Close-up showing convergence after sudden change without overshoot
4.11 Error distribution histograms for all scenarios
4.12 RMSE comparison and noise reduction across all scenarios
4.13 ESP32 dashboard showing real-time raw vs filtered angles with FPGA status
4.14 Dashboard displaying LDR readings, power monitoring, and system statistics
4.15 Manual control interface with slider-based servo positioning
4.16 Real-time scrolling plots showing raw (red) vs filtered (green) trajectories
4.17 Fully assembled prototype with dual servo motors, LDR array, and integrated
electronics
4.18 Fully assembled prototype front view showing solar panel, LDRs, and wiring


List of Tables

4.1 FPGA resource utilization
4.2 Kalman filter performance summary


List of Abbreviations

FPGA Field-Programmable Gate Array

HPS Hard Processor System

LDR Light Dependent Resistor

AXI Advanced eXtensible Interface

RMSE Root Mean Square Error



Chapter 1

Introduction

1.1 General Background
The global transition toward renewable energy sources has positioned solar photovoltaic (PV)
technology as a cornerstone of sustainable energy infrastructure. As governments worldwide
implement ambitious carbon reduction targets and renewable energy mandates, the optimiza-
tion of solar energy capture has become increasingly critical. Solar panels operating in fixed
positions can only harvest a fraction of available solar energy throughout the day, as the sun’s
position continuously changes due to Earth’s rotation and orbital mechanics.

Solar tracking systems address this fundamental limitation by dynamically orienting photovoltaic
panels to maintain optimal alignment with the sun’s position. Research has demonstrated
that dual-axis solar tracking can substantially increase energy capture compared to fixed
installations (Chapter 2 cites a gain of 67.9% for dual-axis trackers over fixed panels), making
it an attractive solution for maximizing return on investment in solar infrastructure. Traditional
solar tracking systems typically employ either passive tracking mechanisms or
basic microcontroller-based control systems that rely solely on sensor feedback or astronomical
calculations.

The emergence of embedded systems and field-programmable gate arrays (FPGAs) has
opened new possibilities for intelligent control systems that can enhance tracking precision
through predictive algorithms. Modern solar tracking applications demand not only accurate
positioning but also smooth operation, minimal power consumption, and robust performance
under varying environmental conditions. The integration of advanced filtering techniques,
such as Kalman filters, with hardware acceleration presents an opportunity to achieve superior
tracking performance while maintaining computational efficiency.

Wireless connectivity and Internet of Things (IoT) technologies have further transformed
the landscape of solar energy systems, enabling remote monitoring, predictive maintenance,
and real-time performance optimization. The convergence of embedded control, hardware
acceleration, and wireless monitoring creates the foundation for next-generation intelligent
solar tracking systems that can adapt to changing conditions and provide comprehensive
operational insights.

1.2 Objectives and Purpose
The primary objective of this project is to design and implement an intelligent dual-axis solar
tracking system that integrates embedded control, hardware-accelerated predictive filtering,
and wireless monitoring to achieve enhanced tracking precision and system performance. The
system aims to demonstrate how hardware-software co-design can improve upon traditional
solar tracking approaches through the strategic combination of different computing platforms.
The specific objectives of this project include:

1. Embedded Control Development: Design and implement a microcontroller-based
control system using Arduino that employs Proportional-Integral-Derivative (PID) con-
trol algorithms to drive dual-axis servo motors for precise solar panel orientation.

2. Hardware-Accelerated Filtering: Develop and deploy a Kalman filter algorithm on
FPGA fabric to provide predictive position estimation that compensates for measurement
delays, sensor noise, and environmental disturbances.

3. System Integration: Establish robust communication protocols between the Arduino
controller and FPGA processing unit to enable real-time data exchange and coordinated
control actions.

4. Wireless Monitoring: Implement a comprehensive monitoring and visualization system
using ESP32 technology that provides real-time system status, performance metrics, and
remote control capabilities.

5. Performance Evaluation: Conduct thorough testing and analysis to quantify the im-
provements in tracking accuracy, stability, and overall system performance achieved
through the integrated approach.

The project serves both educational and practical purposes, demonstrating advanced em-
bedded systems concepts while creating a functional renewable energy solution suitable for
small-scale applications. The modular design approach ensures that individual components can
be analyzed and optimized independently while contributing to overall system performance.

1.3 Significance and Importance
This project addresses several critical challenges in modern solar tracking systems while ad-
vancing the state of the art in embedded renewable energy control. The significance of this
work extends across multiple domains, from renewable energy optimization to embedded sys-
tems engineering and hardware-software co-design methodologies.

Renewable Energy Impact: The integration of predictive control algorithms represents
a significant advancement over traditional reactive tracking systems. By anticipating solar
position changes and compensating for system delays, the proposed approach can achieve
smoother tracking motion and reduced mechanical wear, ultimately leading to improved energy
capture efficiency and longer system lifespan.

Embedded Systems Innovation: The project demonstrates the practical application of
heterogeneous computing in embedded systems, where different processing platforms (mi-
crocontroller, FPGA, and wireless module) collaborate to achieve superior performance. This
approach showcases how modern embedded system designers can leverage the unique strengths
of different computing architectures.

Hardware Acceleration Benefits: The FPGA implementation of the Kalman filter pro-
vides computational acceleration while maintaining real-time performance requirements. This
approach demonstrates how complex mathematical algorithms can be efficiently implemented
in hardware to achieve deterministic timing and reduced computational load on the primary
controller.

Educational Value: The project serves as an excellent educational platform for under-
standing advanced control theory, digital signal processing, embedded systems design, and
renewable energy systems. The modular architecture allows students and researchers to ex-
amine individual components while understanding their integration within a complete system.

Scalability and Adaptability: The design principles and architectural approaches devel-
oped in this project can be scaled and adapted for larger solar installations and other tracking
applications, making the research relevant for both small-scale and commercial implementa-
tions.

Cost-Effectiveness: By utilizing widely available development platforms and demonstrat-
ing efficient resource utilization, the project provides insights into developing cost-effective
intelligent tracking solutions that can compete with commercial alternatives.

1.4 Summary of Contributions and Achievements
This project makes several novel contributions to the field of intelligent solar tracking systems
and embedded control:

System Architecture Contributions:

• Development of a heterogeneous embedded system architecture that combines Arduino-
based PID control, FPGA-accelerated Kalman filtering, and ESP32-based wireless mon-
itoring

• Design of efficient communication protocols for real-time data exchange between differ-
ent computing platforms

• Implementation of a modular system design that enables independent optimization of
control, filtering, and monitoring subsystems

Control System Innovations:

• Integration of predictive Kalman filtering with traditional PID control to achieve en-
hanced tracking smoothness and accuracy

• Development of adaptive control parameters that respond to varying environmental
conditions and system dynamics

• Implementation of intelligent sensor fusion that combines light-dependent resistor (LDR)
measurements with predictive algorithms

Hardware Implementation Achievements:

• Successful deployment of a scalar Kalman filter algorithm on FPGA hardware using
fixed-point arithmetic optimization

• Achievement of real-time performance requirements while maintaining computational
accuracy

• Demonstration of effective resource utilization on the DE1-SoC development platform

System Integration Accomplishments:

• Creation of a comprehensive monitoring and visualization interface that provides real-
time system insights

• Development of robust error handling and fault tolerance mechanisms


• Implementation of performance tracking and statistical analysis capabilities for system
optimization

The completed system successfully demonstrates improved tracking performance compared
to traditional approaches while maintaining cost-effectiveness and educational accessibility.
The project provides a foundation for future research in intelligent renewable energy systems
and hardware-software co-design methodologies.

1.5 Organization of the Report
This report is organized into six main chapters, each addressing specific aspects of the solar
tracking system design, implementation, and evaluation:

Chapter 1: Introduction provides the foundational context for the project, including
background information on solar tracking technology, project objectives, significance of the
research, and key contributions.

Chapter 2: Literature Review examines existing solar tracking systems, control algo-
rithms, Kalman filtering approaches, and FPGA-based acceleration. This chapter establishes
the theoretical foundation and motivates the heterogeneous architecture adopted in this work.

Chapter 3: Methodology presents the system architecture and co-design methodology,
detailing hardware platform roles (Arduino, DE1-SoC FPGA, ESP32), communication pro-
tocols, and the fixed-point adaptive Kalman filter hardware implementation and verification
approach.

Chapter 4: Results reports FPGA synthesis/resource utilization outcomes, Kalman filter
simulation performance across test scenarios, and integrated system demonstration results
including real-time dashboard visualization.

Chapter 5: Discussion interprets the key findings, compares the proposed approach
against prior work, analyzes implications of major hardware/software design choices, and doc-
uments practical limitations and deployment considerations.

Chapter 6: Conclusion and Recommendations summarizes the project’s achievements
and contributions, and provides concrete recommendations for future extensions such as
runtime-adaptive filtering, predictive models, power management, and enhanced monitoring
analytics.

Additional supporting materials, including detailed code listings, hardware specifications,
and supplementary analysis, are provided in the project repository to ensure reproducibility and
facilitate further research.


Chapter 2

Literature Review

2.1 Introduction
Global energy demand continues to rise, with renewable sources becoming increasingly critical.
Photovoltaic (PV) systems offer a promising solution, but their efficiency depends heavily
on proper orientation toward the sun. Dual-axis solar trackers address this by dynamically
adjusting panel orientation, with research showing they can generate 31.4% more energy than
single-axis trackers and 67.9% more than fixed panels Amadi and Gutiérrez (2019).

The evolution of solar tracking systems mirrors trends in embedded systems design.
Early implementations used simple time-based controllers or basic light sensors. The Ar-
duino platform brought accessible microcontroller-based systems using Light Dependent Re-
sistors (LDRs) and straightforward control logic Mohanapriya et al. (2021). More recently,
IoT-capable systems have emerged, with Mustafa et al.’s ESP32-based tracker incorporating
software-based Kalman filtering, PID control, and web dashboards, achieving approximately
43% energy gain over fixed panels MUSTAFA (2024).

However, a significant gap persists in the literature. Accessible end-user systems prioritize
simplicity and cost but sacrifice performance due to computational limitations. Research-
focused implementations demonstrate advanced algorithms on high-performance hardware
but remain impractical for deployment. Even systems incorporating FPGAs, like BaBars et
al.’s work, use the FPGA as a monolithic controller rather than a specialized co-processor
BaBars et al. (2025).

This review argues that a heterogeneous architecture—strategically combining Arduino for
sensor acquisition, FPGA for hardware-accelerated signal processing, and ESP32 for wireless
connectivity—represents the optimal solution. By leveraging each component’s strengths,
such a system can deliver research-grade filtering performance within a practical, deployable
platform, addressing the fundamental trade-off between accessible simplicity and computa-
tional sophistication.

2.2 Existing Microcontroller Systems and Their Limitations
The Arduino platform has become the standard for entry-level solar tracking projects due
to its accessibility and straightforward programming model. A typical implementation uses
four LDR sensors in a quadrant configuration, with the Arduino processing analog inputs,
calculating positional errors, and driving motors to align the panel with the sun Mohanapriya
et al. (2021). This design is economical and intuitive—essentially comparing sensor readings
and moving toward brighter light.
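
The comparison logic described above can be sketched in a few lines. This is a minimal illustration rather than the code of any cited system: the function names, the averaging of sensor pairs, the sign convention, and the deadband value are all assumptions.

```python
def quadrant_errors(top_left, top_right, bottom_left, bottom_right):
    """Return (horizontal_error, vertical_error) from four LDR readings.

    Readings are raw ADC counts; a positive horizontal error means the
    right-hand pair is brighter, a positive vertical error means the top
    pair is brighter (assumed sign convention).
    """
    left = (top_left + bottom_left) / 2
    right = (top_right + bottom_right) / 2
    top = (top_left + top_right) / 2
    bottom = (bottom_left + bottom_right) / 2
    return right - left, top - bottom


def step_direction(error, deadband=20):
    """Map an error to -1, 0, or +1; errors inside the deadband are ignored."""
    if abs(error) <= deadband:
        return 0
    return 1 if error > 0 else -1


# Hypothetical readings: the right-hand sensors are clearly brighter.
h_err, v_err = quadrant_errors(510, 620, 490, 600)
print(h_err, v_err, step_direction(h_err), step_direction(v_err))
# prints: 110.0 20.0 1 0
```

The deadband keeps the panel still when opposing sensor pairs read nearly the same, which avoids constant small corrections, but it does not separate genuine light changes from sensor noise.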

However, the Arduino Uno’s 16 MHz ATmega328P, with its limited memory (32 KB flash, 2 KB
SRAM), restricts algorithm sophistication. When LDR sensors provide noisy data, as is common
with intermittent cloud cover or reflections, the Arduino cannot run advanced filtering
without compromising control loop responsiveness.

The ESP32 has enabled a new generation of "smart" trackers with wireless connectivity and
monitoring interfaces. Mustafa et al.’s system integrates LDR sensing, software-based Kalman
filtering, PID control, and a web dashboard—all on a single ESP32 MUSTAFA (2024). While
the dual-core 240 MHz processor is considerably more capable, it must time-slice between
multiple demanding tasks: sensor acquisition, Kalman filter equations, PID control, WiFi
communication, and web serving.

This creates fundamental limitations. First, complex algorithms like Kalman filters require
multiple floating-point operations per update cycle, executed serially on the sequential
processor. Research by Linares-Barranco et al. shows that CPU-based filtering exhibited 8-24 ms
latencies versus 4-4.2 ms on FPGAs, meaning CPU latency was roughly 188-570% of the FPGA
latency (about 1.9 to 5.7 times as long) Linares-Barranco et al. (2019).
Second, LDR sensors produce inherently noisy analog signals, and microcontroller-based systems
typically use simple techniques like moving averages that lack the mathematical rigor of
optimal estimation. Third, designers face a zero-sum trade-off: adding features or improving
one aspect necessarily degrades another. Implementing a more complex filter reduces control
loop update rates. Serving a feature-rich dashboard consumes memory and processor time
needed for accurate control.

These limitations manifest as reduced tracking accuracy during challenging conditions,
slower response to changing light, and system instability when subsystems compete for re-
sources. The question becomes: how to transcend these limitations without abandoning the
cost-effectiveness and accessibility that make microcontroller systems attractive?

2.3 Kalman Filtering for Optimal State Estimation
The Kalman filter, introduced by Rudolf E. Kálmán in 1960, provides the optimal recursive
solution to discrete-data linear filtering problems. As detailed in the authoritative text re-
viewed by Bass, the filter gives minimum mean-square error state estimates based on noisy
measurements Bass (1996). Unlike ad-hoc techniques, it is mathematically proven optimal
under certain conditions (linear system dynamics, Gaussian noise, known noise statistics).

The filter operates in a predict-update cycle. In prediction, it forecasts the next state
using the system’s dynamic model. In update, it incorporates a new measurement by opti-
mally weighing prediction against measurement based on their respective uncertainties. This
weighting, determined by the Kalman gain computed from estimation error covariance, allows
the filter to "learn" appropriate trust levels over time.
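
The predict-update cycle can be illustrated with a scalar filter. The sketch below assumes a constant-state model (the predicted state equals the previous estimate); the names x, p, q, r, and k and all numeric values are illustrative and not taken from any cited implementation.

```python
def kalman_step(x, p, z, q, r):
    """Run one predict-update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its error variance
    z    : new measurement
    q, r : process and measurement noise variances (assumed known)
    Returns the new estimate, its variance, and the gain that was used.
    """
    # Predict: the state carries over and uncertainty grows by q.
    x_pred = x
    p_pred = p + q

    # Update: the gain weighs prediction against measurement.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, k


# Hypothetical noisy angle measurements around 10 degrees.
x, p = 0.0, 1.0
for z in [10.0, 10.4, 9.7, 10.1]:
    x, p, k = kalman_step(x, p, z, q=0.01, r=4.0)
    print(f"z={z:5.1f}  gain={k:.3f}  estimate={x:.3f}")
```

As the error variance p shrinks over successive updates, the gain falls and the filter leans more on its prediction, which is the "trust level" adjustment described above.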

For solar tracking, the characteristics align remarkably well with system requirements.
The filter’s optimal weighting distinguishes genuine sun position changes from random noise
fluctuations, producing smooth angle estimates even with highly variable sensor readings.
It can fuse information from four LDR sensors into a unified state estimate that is more robust
than a simple error-signal calculation. During brief cloud cover, when LDR readings become
unreliable, the filter relies more on its prediction model, preventing erratic movements. The
continuously updated estimates enable smoother motor control, reducing mechanical wear and
power consumption.
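
The shift toward the prediction when measurements become unreliable can be read off the scalar gain formula. The symbols below are assumptions for illustration: \hat{x}^- and P^- are the predicted state and its error variance, z is the measurement, and R is the measurement noise variance.

```latex
K = \frac{P^-}{P^- + R}, \qquad
\hat{x} = \hat{x}^- + K\,(z - \hat{x}^-), \qquad
\lim_{R \to \infty} K = 0, \qquad
\lim_{R \to 0} K = 1
```

A large R drives K toward 0, so the estimate stays near the prediction; a small R drives K toward 1, so the estimate follows the measurement. In a filter with fixed noise parameters the gain does not depend on the measurements themselves, so the behavior described for cloud cover presumes that R is raised, or the measurement is otherwise discounted, when readings are judged unreliable.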

Mustafa et al. demonstrated that even software-implemented Kalman filtering on an
ESP32 provided tangible benefits MUSTAFA (2024). However, their work highlights the
computational challenge: the filter consumed significant processor resources, requiring optimizations
to maintain real-time performance alongside other tasks. The filter involves numerous
matrix operations (multiplications, additions, inversions) in each update cycle. On
conventional microcontrollers, these execute sequentially. As filter dimensionality increases,
computational burden grows rapidly, forcing compromises in filter models, update rates, or
precision.

2.4 FPGAs for Hardware-Accelerated Filtering
Field-Programmable Gate Arrays represent a fundamentally different computing paradigm.
While microcontrollers execute a stored program one instruction at a time, FPGAs
provide a fabric of configurable logic blocks and programmable interconnects arranged to create
custom digital circuits. As Woods et al. explain, this architecture is particularly well-suited for
DSP algorithms benefiting from parallel, pipelined execution Woods et al. (2008).

For Kalman filtering, the advantages are substantial. Matrix operations required in each
iteration execute sequentially on microcontrollers: multiply a and b, store the result, multiply
c and d, add the results. Total latency is the sum of the individual operation latencies. On FPGAs,
these operations become parallel hardware circuits. Multiple multipliers operate simultaneously,
and pipelining allows different computation stages to overlap. A well-designed FPGA
implementation completes an entire Kalman filter update in a fixed number of clock cycles because
all necessary arithmetic units operate concurrently. This processing is deterministic: there is no
operating system, no interrupts, and no context switching.
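
The latency argument can be made concrete with a toy cycle-count model. The cycle counts are hypothetical, chosen only to mirror the multiply, multiply, add sequence in the text, and the model ignores memory access and I/O.

```python
def sequential_cycles(op_latencies):
    """One shared execution unit: operation latencies simply add up."""
    return sum(op_latencies)


def parallel_cycles(stages):
    """Dedicated hardware per operation.

    Operations inside a stage run at the same time, so a stage costs as
    much as its slowest operation; stages still run one after another
    because later stages depend on earlier results.
    """
    return sum(max(stage) for stage in stages)


# Hypothetical costs: two 3-cycle multiplies feed one 1-cycle add.
stages = [[3, 3], [1]]
all_ops = [cycles for stage in stages for cycles in stage]

print(sequential_cycles(all_ops))  # prints 7
print(parallel_cycles(stages))     # prints 4
```

The gap widens as more independent operations share a stage, which is why the benefit grows with filter dimensionality and is modest for very small filters.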

Empirical evidence supports these advantages. Linares-Barranco et al.’s comparison
of software filtering on CPUs versus FPGAs showed CPU implementations with 8-24 ms
latencies while the FPGA achieved a consistent 4-4.2 ms, meaning CPU latency was roughly
188-570% of the FPGA latency (about 1.9 to 5.7 times as long) Linares-Barranco et al. (2019).
AlShabi and Bonny implemented an Unscented Kalman Filter (UKF), which is more complex than
the standard Kalman filter, on an FPGA for target tracking, achieving real-time performance for
an algorithm that would severely strain microcontrollers AlShabi and Bonny (2022). The UKF
requires not only matrix operations but also sigma point generation and transformation, yet
the FPGA hardware handled it efficiently.
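
The cited latency ranges can be compared with simple arithmetic. The sketch below uses the 8-24 ms and 4.2 ms figures quoted above; comparing both CPU values against the 4.2 ms FPGA figure is an assumption made for illustration, and the function names are hypothetical.

```python
CPU_MS = (8.0, 24.0)   # CPU latency range quoted in the text
FPGA_MS = 4.2          # upper FPGA latency quoted in the text (assumed pairing)


def latency_ratio(cpu_ms, fpga_ms):
    """How many times longer the CPU takes than the FPGA."""
    return cpu_ms / fpga_ms


def reduction_percent(cpu_ms, fpga_ms):
    """Latency saved by the FPGA as a percentage of the CPU latency."""
    return 100.0 * (cpu_ms - fpga_ms) / cpu_ms


for cpu in CPU_MS:
    print(f"{cpu:.0f} ms vs {FPGA_MS} ms: "
          f"{latency_ratio(cpu, FPGA_MS):.2f}x as long, "
          f"{reduction_percent(cpu, FPGA_MS):.1f}% reduction")
```

A percentage reduction in latency is bounded by 100%, so large multiples are clearer when stated as ratios.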

However, FPGAs are not universally superior. They have higher costs, require specialized
design skills (Verilog/VHDL), consume more power, and involve complex development
workflows. For simple tasks like reading sensors or toggling GPIO pins, microcontrollers are
more appropriate, with simpler programming models and lower costs. But for computationally
intensive tasks like real-time Kalman filtering, the cost-performance equation shifts
toward FPGAs, where the development complexity is justified by multi-fold latency
improvements and deterministic timing.

This leads to a critical insight: rather than choosing between microcontrollers and FPGAs,
optimal system design should leverage both, assigning tasks to components best suited for
them. This heterogeneous approach—microcontrollers for I/O and control, FPGAs for inten-
sive computation—represents a middle ground combining accessibility with high performance.

2.5 Research Gap and Opportunity
A comprehensive survey of solar tracking implementations reveals a clear bifurcation with
minimal middle ground. The overwhelming majority of practical systems use a single
microcontroller (Arduino or ESP32) as the sole computational element, prioritizing simplicity and
cost-effectiveness Mohanapriya et al. (2021); MUSTAFA (2024). Advanced academic imple-
mentations demonstrate sophisticated algorithms on high-performance hardware like FPGAs
AlShabi and Bonny (2022), but remain laboratory prototypes due to complexity and lack of
practical integration.

Notably absent are systems integrating FPGAs as specialized co-processors within accessible
architectures. BaBars et al.’s work represents a partial step, using an FPGA (Xilinx
Spartan) as the core controller with IoT connectivity BaBars et al. (2025). However, their
design uses the FPGA as a monolithic controller replacement, handling all tasks from sensor
acquisition to motor control to communication. This misses the opportunity for optimal task
specialization that heterogeneous architecture provides—using an FPGA for reading analog
sensors or toggling motor pins is like using a supercomputer for word processing.

This gap can be summarized across three dimensions:

• Hardware architecture: end-user systems favor single-MCU designs, while research
favors FPGA-centric prototypes; accessible systems rarely use an FPGA as a dedicated
compute co-processor.

• Algorithm realization: practical designs rely on basic control and lightweight filtering,
whereas research demonstrates advanced Kalman variants; translating these algorithms
into deployable, real-time embedded pipelines remains uncommon.

• System integration: many works are either all-in-one MCU implementations MUSTAFA
(2024) or FPGA-as-main-controller designs BaBars et al. (2025), leaving limited em-
phasis on task-partitioned, maintainable heterogeneous co-design.

The gap represents both a limitation and an opportunity for innovation. A heterogeneous
architecture combining Arduino (for I/O), FPGA (for filtering), and ESP32 (for connectivity)
would leverage each component’s strengths while avoiding weaknesses. The Arduino provides
reliable, low-latency sensor acquisition and motor control without operating system overhead.
The FPGA delivers order-of-magnitude performance improvement for Kalman filtering without
unnecessary complexity for simple tasks. The ESP32 offers seamless wireless connectivity and
web serving without consuming computational resources needed for control. This synergistic
approach represents the logical next step—bridging the gap between accessible, practical
systems and research-grade signal processing performance.

2.6 Proposed Heterogeneous Architecture
The analysis converges on a clear conclusion: the optimal solar tracking system requires a
heterogeneous architecture that assigns tasks according to component strengths.

The Arduino Mega serves as the sensor acquisition and actuation hub. Its 10-bit ADC
provides adequate resolution for LDR readings, and its real-time execution model ensures
deterministic sampling timing. The Arduino implements PID control loops driving motors based on
filtered angle estimates from the FPGA. This plays to the Arduino’s strengths—robust hard-
ware interfacing, predictable timing, straightforward control logic—while avoiding its weakness
in complex numerical computation. By offloading Kalman filtering to the FPGA, the Arduino
maintains high-frequency sensor sampling and rapid control loop execution without compro-
mise.

The DE1-SoC FPGA implements dual Kalman filters—one for azimuth, one for eleva-
tion—in fully parallel, pipelined hardware. The filters receive raw LDR data from the Arduino
via serial interface, perform optimal estimation using fixed-point arithmetic (Q9.7 format
for 0-180° servo range), and transmit filtered angle estimates back at high rates. Parallel
arithmetic units perform matrix operations simultaneously, achieving microsecond rather than
millisecond latencies. Deterministic timing ensures consistent filtering performance regardless
of communication load. The FPGA is used exclusively for what it does best—intensive, par-
allel numerical computation—maximizing cost-effectiveness by ensuring capabilities are fully
utilized for the specific problem it uniquely solves.

The ESP32 hosts the wireless interface and web dashboard. It receives filtered angle estimates
and system status from the Arduino, serves a real-time web interface displaying tracking
performance, and provides remote control capabilities. The ESP32’s integrated WiFi stack and
ample processing power make it well suited to this role. Crucially, it is freed from the
computational burden of running Kalman filters or maintaining precise control timing, so it can
dedicate its resources entirely to communication and user interface tasks.

The architecture’s benefits exceed the sum of parts. By distributing computational tasks
across specialized hardware, the system achieves performance levels impossible for any single
microcontroller. The Kalman filter runs at FPGA speeds (microsecond latency), control loops
execute at Arduino real-time rates (millisecond update cycles), and the web dashboard oper-
ates smoothly without impacting tracking performance. Clear separation of concerns simplifies
development and debugging. The system is scalable—adding sensors or sophisticated control
strategies on Arduino doesn’t affect FPGA filtering performance, and extending Kalman filters
doesn’t impact Arduino control timing or ESP32 communication.

Most fundamentally, this approach bridges the identified gap between accessible end-user
systems and research-grade performance. Arduino and ESP32 components keep the system
approachable with familiar programming models and extensive community support. The FPGA
component brings research-level signal processing into a practical, deployable system. Users
gain benefits of hardware-accelerated Kalman filtering (optimal noise rejection, sensor fusion,
predictive capability) without sacrificing accessibility and user-friendliness of conventional de-
signs.

Unlike all-in-one microcontroller systems, it forces no computational compromises Mo-
hanapriya et al. (2021); MUSTAFA (2024). Unlike pure FPGA implementations, it doesn’t use
expensive hardware for tasks simple microcontrollers handle effectively BaBars et al. (2025).
Unlike research systems focusing on algorithm implementation without practical deployment
considerations AlShabi and Bonny (2022), it integrates high-performance filtering within a
complete system with user interfaces, remote monitoring, and straightforward replication.

2.7 Summary
This literature review traced solar tracking system evolution from simple controllers to so-
phisticated IoT platforms, analyzed Kalman filtering foundations and benefits, demonstrated
FPGA architectural advantages for real-time signal processing, and identified a critical gap in
existing implementations.

Current systems occupy two extremes. End-user designs prioritize accessibility by using
single microcontrollers for all functions, but encounter computational bottlenecks and perfor-
mance trade-offs as complexity increases. Research implementations demonstrate advanced
algorithms and high-performance hardware but remain impractical for deployment due to spe-
cialization and lack of integration with user-friendly components.

The identified gap—absence of practical systems integrating hardware-accelerated signal
processing with accessible microcontroller-based components—represents both a limitation
and an opportunity. The proposed heterogeneous architecture, strategically distributing tasks
among Arduino (sensing and actuation), FPGA (Kalman filtering), and ESP32 (wireless con-
nectivity), directly addresses this gap by combining conventional embedded system accessibility
with specialized signal processing hardware performance.

Expected contributions extend beyond demonstrating another solar tracker. By success-
fully integrating FPGA-based Kalman filtering into a practical, user-friendly system, this
project establishes a model for future embedded system designs leveraging component spe-
cialization rather than forcing monolithic solutions. It demonstrates that benefits of advanced
signal processing algorithms—proven in theory and demonstrated in isolation—can be brought
to practical applications through thoughtful architectural design matching computational tasks
to optimal hardware platforms.

Ultimately, this work aims to show that the choice between accessible simplicity and
computational sophistication is a false dichotomy. Through heterogeneous system design,
both goals can be achieved simultaneously, creating solar tracking systems that are practical
to build and deploy while delivering research-grade filtering performance. This represents the
natural evolution of embedded system design for applications where sensing, signal processing,
control, and communication all play critical roles.


Chapter 3

Methodology

3.1 Overall System Architecture and Block Diagram
The proposed solar tracking system employs a heterogeneous architecture that strategically
distributes computational tasks across three specialized hardware platforms: Arduino Mega for
sensor acquisition and actuation, DE1-SoC FPGA for hardware-accelerated Kalman filtering,
and ESP32 for wireless connectivity and web-based monitoring. This architecture is designed
to overcome the fundamental trade-offs that limit monolithic microcontroller implementations
while maintaining practical deployability.


Figure 3.1: System-level block diagram showing the heterogeneous data flow be-
tween Arduino (sensor/actuation layer), DE1-SoC FPGA (computation layer), and
ESP32 (communication layer)

Figure 3.1 illustrates the complete system architecture and the strategic division of la-
bor among the three computational platforms. The data flow follows a carefully orches-
trated pipeline designed to maximize the strengths of each component while minimizing inter-
component communication overhead.

The Arduino Mega serves as the front-end sensor acquisition and actuation controller. It
interfaces with four Light Dependent Resistors (LDRs) arranged in a quadrant configuration
to capture sunlight intensity from multiple directions. The Arduino implements a PID control
algorithm that computes raw azimuth and elevation angle commands based on the differential
LDR readings. Rather than directly commanding the servo motors with these potentially noisy
estimates, the Arduino transmits the raw angle data to the FPGA for filtering. This design
decision is crucial: by offloading the computationally intensive Kalman filter to dedicated
hardware, the Arduino maintains a fast, deterministic control loop without time-slicing between
sensing, filtering, and actuation tasks.

The DE1-SoC FPGA forms the computational core of the system. It receives raw an-
gle measurements from the Arduino via UART serial communication at 115200 baud. The
FPGA fabric implements dual Kalman filters—one for azimuth and one for elevation—in fully
parallel, pipelined hardware. These filters perform optimal state estimation using fixed-point
arithmetic (Q9.7 format, providing 16-bit signed representation with 7 fractional bits, suitable
for the 0° to 180° servo range). The filter computation itself takes only microseconds; the
filtered angle estimates are then transmitted back to the Arduino through the same UART link,
so the complete round trip takes approximately 2 milliseconds, which still enables real-time
closed-loop control with research-grade signal processing performance. The FPGA’s Hard
Processor System (HPS), featuring a dual-core ARM Cortex-A9 running at 925 MHz, plays a
critical role during system initialization by programming the FPGA fabric, and its AXI bridges
provide future extensibility for advanced monitoring capabilities.

The ESP32 microcontroller handles all wireless communication and user interface function-
ality. It receives system telemetry from the Arduino via a second UART link operating at 9600
baud, including filtered and raw angle data, LDR sensor values, servo positions, power con-
sumption metrics, and system statistics. The ESP32 operates as a WiFi Access Point, hosting
a responsive web dashboard that provides real-time visualization, remote parameter tuning,
and manual control capabilities. By dedicating the ESP32 exclusively to communication tasks,
the system avoids the performance degradation that occurs when a single microcontroller must
handle sensing, filtering, control, and wireless communication concurrently.

This heterogeneous architecture achieves several critical objectives. First, it breaks the
performance ceiling imposed by sequential processing limitations—the Kalman filter executes
in parallel hardware at FPGA clock speeds while the Arduino control loop maintains millisec-
ond update rates without interference. Second, it maintains accessibility—the Arduino and
ESP32 use familiar programming environments (Arduino IDE and C/C++), while only the
computationally critical filtering component requires hardware description language (Verilog)
implementation. Third, it provides modularity and scalability—each subsystem can be inde-
pendently developed, tested, and enhanced without affecting the others. Finally, it demon-
strates a practical model for integrating advanced DSP algorithms into embedded systems
without requiring monolithic high-performance solutions that increase cost and complexity
unnecessarily.

3.2 Hardware Subsystem Design
3.2.1 Sensor and Actuation Layer (Arduino Domain)
The Arduino Mega 2560 microcontroller serves as the sensor acquisition and motor control
platform, chosen for its abundant I/O capabilities, mature ecosystem, and deterministic real-
time execution characteristics. The board provides 16 analog input channels with 10-bit
resolution (0-1023 ADC range), sufficient for interfacing with Light Dependent Resistor (LDR)
sensors that exhibit resistance variations from approximately 1 kΩ in bright light to over 10
MΩ in darkness.

Figure 3.2: Arduino Mega connection diagram showing LDR sensor interfacing,
servo motor control, and UART communication links to FPGA and ESP32

Figure 3.2 details the Arduino’s interface connections. Four LDR sensors are connected
to analog inputs A0 (top-left), A1 (top-right), A2 (bottom-left), and A3 (bottom-right), each
configured in a voltage divider network with a 10 kΩ fixed resistor to ground. This arrangement
produces an output voltage that rises with light intensity, although not linearly. Two servo motors,
one for azimuth (horizontal) rotation and one for elevation (vertical) tilt, are controlled via PWM signals on
digital pins 3 and 4, generating standard 50 Hz signals with pulse widths from 1000 to 2000
microseconds for 0° to 180° angular range.
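
The two conversions described above can be illustrated with a short sketch. It assumes an ideal
divider with the LDR on the supply side and the 10 kΩ resistor to ground, and a linear
angle-to-pulse mapping; the function names are illustrative and are not taken from the project
firmware.

```c
#include <stdint.h>

/* Assumed wiring (from the text): LDR between the supply and the analog pin,
 * 10 kOhm fixed resistor from the pin to ground. The ADC is ratiometric, so
 * the supply voltage cancels out of the count. */
#define R_FIXED_OHM    10000.0
#define ADC_FULL_SCALE 1023      /* 10-bit ADC */

/* Expected ADC count for a given LDR resistance (ideal divider, no loading). */
uint16_t ldr_adc_counts(double r_ldr_ohm)
{
    double ratio = R_FIXED_OHM / (r_ldr_ohm + R_FIXED_OHM);
    return (uint16_t)(ratio * ADC_FULL_SCALE + 0.5);
}

/* Linear map from a servo angle (0..180 degrees) to a pulse width in
 * microseconds (1000..2000). Out-of-range requests are clamped. The 32-bit
 * intermediate avoids overflow on targets where int is 16 bits wide. */
uint16_t servo_pulse_us(int angle_deg)
{
    if (angle_deg < 0)   angle_deg = 0;
    if (angle_deg > 180) angle_deg = 180;
    return (uint16_t)(1000 + ((int32_t)angle_deg * 1000) / 180);
}
```

With these assumptions, an LDR at 10 kΩ sits at mid-scale (512 counts), while an LDR at 1 kΩ
reads about 930 counts, which shows how the response compresses toward the top of the range in
bright light.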

Dual-UART Communication Architecture

The Arduino implements two serial communication channels. Serial3 (pins 14/15, operating
at 115200 baud) connects to the FPGA for real-time Kalman filtering. Raw angle data
computed by the PID controller is transmitted in Q9.7 fixed-point format as four-byte packets:
[AZ_HIGH][AZ_LOW][EL_HIGH][EL_LOW]. The Arduino receives filtered estimates in the
same format within approximately 2 milliseconds. Serial1 (pins 18/19, operating at 9600 baud)
connects to the ESP32, transmitting ASCII-formatted telemetry containing sensor readings,
servo positions, angle data, and system statistics for dashboard display.
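
As an illustration of the packet format, the following sketch converts angles to Q9.7 and packs
them in the [AZ_HIGH][AZ_LOW][EL_HIGH][EL_LOW] order given above. The function names are
hypothetical, and the rounding rule (round to nearest) is an assumption; the project firmware
may truncate instead.

```c
#include <stdint.h>

#define Q97_FRAC_BITS 7   /* Q9.7: 7 fractional bits, 1/128 degree per step */

/* Convert degrees to Q9.7, rounding to the nearest representable step. */
int16_t q97_from_deg(double deg)
{
    double scaled = deg * (1 << Q97_FRAC_BITS);
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a Q9.7 value back to degrees. */
double q97_to_deg(int16_t q)
{
    return (double)q / (1 << Q97_FRAC_BITS);
}

/* Pack azimuth and elevation as big-endian byte pairs. */
void pack_angles(int16_t az, int16_t el, uint8_t out[4])
{
    out[0] = (uint8_t)((uint16_t)az >> 8);
    out[1] = (uint8_t)((uint16_t)az & 0xFFu);
    out[2] = (uint8_t)((uint16_t)el >> 8);
    out[3] = (uint8_t)((uint16_t)el & 0xFFu);
}

/* Reassemble the two Q9.7 values from a received four-byte packet. */
void unpack_angles(const uint8_t in[4], int16_t *az, int16_t *el)
{
    *az = (int16_t)(((uint16_t)in[0] << 8) | in[1]);
    *el = (int16_t)(((uint16_t)in[2] << 8) | in[3]);
}
```

For example, 90° maps to 11520 (0x2D00) and 45.5° maps to 5824 (0x16C0), so the corresponding
packet is 2D 00 16 C0.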

PID Control Implementation

The Arduino firmware uses separate PID controllers for azimuth and elevation. Every 100
ms, it reads the LDR sensors and calculates horizontal and vertical errors by comparing the
average light on each side. The PID controllers use tuned gains for proportional, integral, and
derivative terms to generate raw angle commands.

When the FPGA Kalman filter is enabled, the Arduino transmits these raw angles in Q9.7
fixed-point format to the FPGA via Serial3 (115200 baud). Within approximately 2 millisec-
onds, the FPGA responds with filtered angle values in the same Q9.7 format. The Arduino
then converts these filtered angles back to integer degrees and commands the servos directly.
This entire transaction (transmission to the FPGA, Kalman filter computation in hardware, and
transmission of the result back to the Arduino) completes well within the 100 ms
control loop cycle, ensuring deterministic real-time operation.
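
A rough timing budget helps put the 2 ms figure in context. The sketch below estimates only the
serialization time on the wire, assuming 8N1 framing (10 bits per byte) in both directions; the
helper name is illustrative.

```c
#include <stdint.h>

/* Wire time in microseconds for n bytes at a given baud rate, assuming 8N1
 * framing: 1 start bit + 8 data bits + 1 stop bit = 10 bits per byte.
 * The result is rounded to the nearest microsecond. */
uint32_t uart_wire_time_us(uint32_t n_bytes, uint32_t baud)
{
    uint64_t bits = (uint64_t)n_bytes * 10u;
    return (uint32_t)((bits * 1000000u + baud / 2u) / baud);
}
```

Under these assumptions a four-byte packet takes about 347 µs at 115200 baud, so the request and
the reply together occupy roughly 0.7 ms of the quoted 2 ms. The remainder is available for
filter computation, bridge polling, and software overhead on the Arduino.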

If the FPGA times out (no response within 100 ms) or the filter is disabled, the Arduino
uses the raw PID output directly to command the servos, providing a fallback that maintains
system responsiveness.

To improve reliability, the code includes integral windup protection, derivative filtering
(using a moving average), a deadband to ignore small errors, and logic to detect when ser-
vos reach their limits. These refinements help prevent overshoot, reduce jitter, and extend
hardware life.
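
The control step and its refinements can be sketched as follows. This is a minimal illustration
rather than the project firmware: the sign conventions for the LDR errors, the moving-average
length, and all names are assumptions, and the gains are left to the caller.

```c
#include <stdint.h>

#define DERIV_TAPS 4   /* hypothetical moving-average length */

typedef struct {
    double kp, ki, kd;              /* gains, chosen by the caller */
    double integral;                /* accumulated error */
    double integral_limit;          /* anti-windup clamp, applied symmetrically */
    double deadband;                /* errors smaller than this are ignored */
    double prev_error;
    double deriv_hist[DERIV_TAPS];  /* recent raw derivative samples */
    int    deriv_idx;
} pid_ctrl_t;

/* Horizontal error: mean of the right pair minus mean of the left pair. */
double ldr_horizontal_error(int tl, int tr, int bl, int br)
{
    return (tr + br) / 2.0 - (tl + bl) / 2.0;
}

/* Vertical error: mean of the top pair minus mean of the bottom pair. */
double ldr_vertical_error(int tl, int tr, int bl, int br)
{
    return (tl + tr) / 2.0 - (bl + br) / 2.0;
}

static double clampd(double v, double lo, double hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* One PID step with dt in seconds. Returns the correction to apply. */
double pid_step(pid_ctrl_t *c, double error, double dt)
{
    /* Deadband: treat small errors as zero to avoid servo jitter. */
    if (error > -c->deadband && error < c->deadband) {
        error = 0.0;
    }

    /* Integral term with windup protection. */
    c->integral = clampd(c->integral + error * dt,
                         -c->integral_limit, c->integral_limit);

    /* Derivative term smoothed by a moving average. */
    c->deriv_hist[c->deriv_idx] = (error - c->prev_error) / dt;
    c->deriv_idx = (c->deriv_idx + 1) % DERIV_TAPS;
    c->prev_error = error;
    double d = 0.0;
    for (int i = 0; i < DERIV_TAPS; i++) {
        d += c->deriv_hist[i];
    }
    d /= DERIV_TAPS;

    return c->kp * error + c->ki * c->integral + c->kd * d;
}
```

One controller instance would be kept per axis and called once per 100 ms cycle with dt = 0.1.
Detection of servo end stops, which the text also mentions, is omitted here for brevity.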

Operational Modes and Filter Integration

The system has two modes: AUTO and MANUAL. In AUTO, the Arduino reads LDR sensors,
runs a PID controller, and sends raw angles to the FPGA for Kalman filtering. If the FPGA
responds in time, the filtered angles are used to move the servos; otherwise, the raw angles
are used. In MANUAL, the user sets servo positions directly from the ESP32 dashboard.

When switching to AUTO with filtering, the Arduino sends several packets with the current
position to let the FPGA filter sync up and avoid jumps.
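
The selection logic described in this subsection can be summarized in a few lines. The sketch
below is illustrative only: the enum, the function name, and the argument list are hypothetical,
and the 100 ms timeout is the value quoted in the text.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODE_AUTO, MODE_MANUAL } track_mode_t;

#define FPGA_TIMEOUT_MS 100u   /* timeout quoted in the text */

/* Choose the angle that is sent to a servo on this control cycle. */
double select_servo_angle(track_mode_t mode, bool filter_enabled,
                          bool fpga_replied, uint32_t reply_latency_ms,
                          double raw_pid_deg, double filtered_deg,
                          double manual_deg)
{
    if (mode == MODE_MANUAL) {
        return manual_deg;       /* position set from the dashboard */
    }
    if (filter_enabled && fpga_replied && reply_latency_ms < FPGA_TIMEOUT_MS) {
        return filtered_deg;     /* normal AUTO path: use the Kalman output */
    }
    return raw_pid_deg;          /* fallback: filter disabled or timed out */
}
```

Keeping the decision in one pure function of this kind makes the fallback path easy to test
without hardware attached.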

Statistics and Telemetry

The Arduino tracks basic statistics, such as servo movements, peak light level, and lock time,
and sends these to the ESP32 for dashboard display.

This setup keeps the Arduino fast and responsive by offloading filtering to the FPGA.

3.2.2 Computation Layer (DE1-SoC FPGA Domain)
The DE1-SoC development board, featuring an Intel Cyclone V SoC FPGA, forms the compu-
tational heart of the proposed system. This platform was selected for its unique combination
of FPGA fabric and Hard Processor System (HPS), which together provide both the paral-
lel processing capabilities required for real-time Kalman filtering and the software flexibility
needed for system configuration and future extensibility.

Rationale for DE1-SoC Platform Selection

The Cyclone V SoC device on the DE1-SoC board integrates 85,000 logic elements (LEs),
4,450 Kbits of embedded memory, 87 DSP blocks, and a dual-core ARM Cortex-A9 processor
subsystem operating at 925 MHz. This heterogeneous architecture is crucial for the proposed
system design, as it enables a clear separation of concerns: the FPGA fabric implements
high-speed, deterministic signal processing, while the HPS manages system initialization, con-
figuration, and monitoring.

The HPS-FPGA integration via high-bandwidth AXI bridges is particularly important for
this project. During system startup, the HPS configures the FPGA fabric by loading the
Raw Binary File (RBF) through Passive Parallel configuration mode. This approach provides
several advantages over standalone FPGA designs: (1) it eliminates the need for external
configuration memory or JTAG programmer during normal operation, (2) it allows for dynamic
reconfiguration if future enhancements require different FPGA configurations, (3) it provides a
path for the HPS to monitor FPGA operation via memory-mapped PIO ports, and (4) it offers
potential for hybrid hardware-software algorithms where the HPS performs preprocessing or
post-analysis on FPGA results.

For the current implementation, the HPS operates primarily in a supervisory role. After
programming the FPGA fabric on startup, it monitors four 16-bit PIO (Parallel I/O) ports
connected to the FPGA fabric that carry raw and filtered azimuth and elevation angle values.
While these ports are not actively used in the current system’s primary data path (which
flows directly from Arduino to FPGA to Arduino via UART), they provide future extensibility
for HPS-based data logging, anomaly detection, or advanced diagnostic capabilities without
modifying the real-time critical path.

The passive parallel configuration mode employed on the DE1-SoC board is significant for
deployment. In this mode, the FPGA fabric appears as a passive device to the HPS, which
actively drives the configuration data and control signals. This contrasts with active serial
configuration where the FPGA itself fetches configuration data from external memory. The
passive approach is more robust in the field, as it ensures the HPS can always recover and
reconfigure the FPGA if transient faults occur, providing a degree of fault tolerance absent in
simpler FPGA-only designs.

FPGA System Architecture

Figure 3.3: Cyclone V SoC top-level block diagram showing the FPGA fabric, HPS
subsystem, and their interconnections via high-performance AXI bridges

Figure 3.3 presents the high-level architecture of the Cyclone V SoC device. The FPGA fabric
occupies the majority of the die area and is where the dual Kalman filter cores, Avalon-MM
interconnect fabric, UART controller, and custom interface bridges are implemented. The
HPS subsystem contains the dual-core ARM Cortex-A9 MPCore processor, on-chip memory,
memory controllers for external DDR3 SDRAM, and a rich set of hard IP peripherals including
Ethernet, USB, SD card, and SPI controllers.

The lightweight HPS-to-FPGA and FPGA-to-HPS AXI bridges provide low-latency, high-
bandwidth communication paths between the processor subsystem and the programmable
logic. These 32-bit or 64-bit wide buses operate at frequencies up to 125 MHz, enabling the
HPS to access memory-mapped registers in the FPGA fabric with latencies comparable to
on-chip peripheral access. For this project, the HPS loads the RBF into the FPGA fabric
through its FPGA Manager during the configuration phase (the bridges only become usable once
the fabric is configured), and it monitors the PIO ports through the HPS-to-FPGA bridge
during runtime.

The FPGA fabric implementation consists of several major functional blocks:

Kalman Filter Cores: Two instantiated kalman_scalar modules (implemented in the
file kalman_scalar.v) form the signal processing heart of the system. Each filter operates
independently on one degree of freedom—azimuth or elevation—implementing the full predict-
update cycle of the Kalman filter algorithm. The filters use Q9.7 fixed-point arithmetic, which
provides sufficient precision for the 0° to 180° servo angle range while enabling compact
hardware implementation. The filters include adaptive measurement noise covariance (R)
scaling to handle varying sensor noise conditions, particularly during partly cloudy periods
when LDR readings fluctuate rapidly.
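
To show what Q9.7 arithmetic involves, the following software model multiplies two Q9.7 values
the way a 16x16 hardware multiplier followed by a shift would. It is a sketch, not a
transcription of kalman_scalar.v: the rounding and saturation behaviour are assumptions, and the
code relies on arithmetic right shift of negative values, which common compilers such as GCC and
Clang provide.

```c
#include <stdint.h>

/* Multiply two Q9.7 values.
 * The 32-bit product has 14 fractional bits (Q18.14); adding half an LSB and
 * shifting right by 7 returns to 7 fractional bits with rounding. The result
 * is saturated to the 16-bit range instead of wrapping around. */
int16_t q97_mul(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;   /* Q18.14 intermediate */
    prod += (int32_t)1 << 6;                  /* add 0.5 LSB for rounding */
    prod >>= 7;                               /* assumes arithmetic shift */
    if (prod > INT16_MAX) prod = INT16_MAX;
    if (prod < INT16_MIN) prod = INT16_MIN;
    return (int16_t)prod;
}
```

For example, 1.5 (192) times 2.0 (256) yields 384, which is 3.0 in Q9.7, while 180° (23040)
times 2.0 exceeds the representable range and saturates at 32767.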

Kalman to Avalon Bridge: The custom Kalman_Avalon_Bridge module (implemented
in kalman_2_avalon_bridge.v) serves as the interface between the UART controller’s
Avalon-MM slave interface and the parallel Kalman filter inputs/outputs. This bridge im-
plements a finite state machine that performs the following operations: (1) polls the UART
status register to detect received data availability, (2) reads four-byte angle packets from the
UART receive FIFO, (3) assembles the bytes into Q9.7 azimuth and elevation values, (4)
triggers the Kalman filter cores and waits for filtered results, and (5) transmits the four-byte
filtered angle packet back through the UART transmit FIFO. The bridge includes timeout
mechanisms to ensure system robustness—if the Kalman filter computation exceeds a thresh-
old (currently 10 ms, though typical latency is under 500 microseconds), the bridge transmits
the raw input values as a fallback, ensuring the Arduino control loop never stalls.

Avalon Memory-Mapped Interconnect: The Avalon-MM bus fabric, generated by In-
tel’s Platform Designer (Qsys) tool, provides a hierarchical interconnect between the various
FPGA subsystem components. This standard on-chip bus protocol simplifies the integration
of IP cores and custom logic, as all components speak a common interface language. The
interconnect handles address decoding, data width adaptation, clock domain crossing where
necessary, and arbitration for shared resources.

UART Controller: An Intel-provided RS-232 UART IP core implements the serial com-
munication interface. Configured for 115200 baud, 8N1 format, the UART presents an Avalon-
MM slave interface with memory-mapped registers for transmit data, receive data, and status.
The UART includes small FIFOs (typically 4-8 bytes) to buffer transmitted and received char-
acters, reducing the real-time response requirements on the controlling logic. The UART’s
physical layer connects to GPIO_0 pins on the DE1-SoC board (specifically pins designated
for TX and RX), which are then connected through a 3.3V to 5V logic level shifter to the
Arduino’s 5V TTL serial port.

PIO Ports: Four 16-bit Parallel I/O (PIO) cores are instantiated to make the raw and
filtered angle values visible to the HPS. These ports are configured as inputs from the HPS
perspective, allowing the ARM processor to read the current angle estimates at any time
by performing Avalon-MM read transactions to the appropriate memory-mapped addresses.
While not part of the primary real-time data path in the current implementation, these ports
provide a crucial extensibility mechanism for future enhancements such as HPS-based data
logging to SD card, web-based real-time monitoring served directly from the HPS (bypassing
the ESP32), or implementation of higher-level control strategies that blend hardware (FPGA)
and software (HPS) processing.

Platform Designer (Qsys) System Integration

Figure 3.4: Platform Designer (Qsys) system interconnection diagram showing the
HPS, UART controller, PIO ports, and custom Avalon bridge

Figure 3.4 illustrates the complete system as constructed in Intel’s Platform Designer (for-
merly Qsys) tool. This graphical system integration environment allows hardware designers
to instantiate pre-verified IP cores, define custom components with Avalon-MM or AXI inter-
faces, specify interconnections, and generate the necessary HDL and software header files for
the complete system.

The major components visible in the diagram are:

System PLL: A Phase-Locked Loop clock source that generates the 50 MHz system clock
from the board’s input clock. All Avalon-MM interconnect transactions and most IP cores
operate synchronously to this clock domain.

ARM Cortex-A9 HPS: The Hard Processor System block exposes its AXI master inter-
faces (HPS-to-FPGA bridge) to the Avalon-MM interconnect. This allows the HPS software
to access any memory-mapped peripheral in the FPGA fabric using standard pointer derefer-
encing in C code, as the AXI-to-Avalon bridge handles protocol conversion automatically.

Intel UART (RS-232 Serial Port): The UART IP core’s Avalon-MM slave interface
connects to the interconnect fabric, making its control and data registers accessible at a specific
base address (assigned during system generation). The UART’s external signal interface
connects to pins on the FPGA I/O bank designated as GPIO_0.

PIO Ports (4 instances): Four separate PIO IP cores, each configured as 16-bit input
ports (from the HPS/interconnect perspective), connect to the Avalon-MM bus. These ports
are directly wired in HDL to signals from the Kalman filter output and the bridge’s input
registers, making the angle values readable by the HPS.

External Bus to Avalon Bridge: The custom Kalman_Avalon_Bridge module appears
as an Avalon-MM master in the system. As a master, it initiates read and write transactions
to the UART’s slave interface. The Platform Designer tool automatically instantiates the nec-
essary interconnect logic to route these transactions, handle clock domain crossing if needed,
and manage arbitration if multiple masters exist.

The hierarchical nature of this system illustrates the power of the Avalon-MM paradigm:
complex functionality can be built from well-defined, independently verifiable components
connected through a standard interface. This modularity significantly simplified system de-
velopment, as each block (Kalman filter, bridge, UART) could be tested in isolation before
integration.

Memory Map and Address Decoding

Figure 3.5: Platform Designer address map showing base addresses and spans for
all memory-mapped peripherals in the FPGA fabric

Figure 3.5 presents the memory map generated by Platform Designer for the system. Under-
standing this map is crucial for software development on the HPS, as it defines the physical
addresses the ARM processor must use to access each peripheral.

The UART controller occupies a small address span (typically 32 bytes is sufficient for the
handful of registers: RXDATA, TXDATA, STATUS, and CONTROL). Within the UART’s address
space, the standard register offsets are:

• RXDATA (offset 0x0000): Read-only register that returns the next byte from the receive
FIFO and automatically dequeues it.

• TXDATA (offset 0x0004): Write-only register that enqueues a byte into the transmit
FIFO.

• STATUS (offset 0x0008): Read-only register with bit flags indicating RRDY (receive data
available), TRDY (transmit buffer ready), and error conditions.

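• CONTROL (offset 0x000C): Read/write register holding the interrupt-enable bits. Baud rate
and parity are not set here: they are fixed when the core is generated (or, for baud rate,
adjusted through the core’s separate DIVISOR register when that option is enabled).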

The Kalman_Avalon_Bridge module’s control logic (implemented as an FSM in Verilog)
uses these offsets relative to the UART’s base address to poll for received data, read angle
bytes, and transmit filtered results. The bridge’s state machine follows this sequence: (1)
read STATUS register, (2) check RRDY bit, (3) if set, read RXDATA register to get one byte,
(4) repeat until four bytes assembled, (5) trigger Kalman filter, (6) wait for filter completion,
(7) read STATUS register, (8) check TRDY bit, (9) if set, write one filtered angle byte to
TXDATA register, (10) repeat until four bytes transmitted.
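
The ten-step sequence can be modelled in software to check its logic before writing HDL. The
sketch below runs against a small mock of the UART registers. Everything other than the register
offsets is an assumption made for illustration: the bit positions chosen for RRDY and TRDY, the
mock itself, and the pass-through stub that stands in for the hardware filter.

```c
#include <stddef.h>
#include <stdint.h>

/* Register byte offsets as listed above. */
enum { UART_RXDATA = 0x0, UART_TXDATA = 0x4, UART_STATUS = 0x8 };

/* Assumed status bit positions; verify against the IP core documentation. */
#define STATUS_TRDY (1u << 6)
#define STATUS_RRDY (1u << 7)

/* Minimal mock of the UART slave so the sequence can run on a PC. */
typedef struct {
    const uint8_t *rx;      /* bytes waiting to be received */
    size_t rx_len, rx_pos;
    uint8_t tx[8];          /* bytes written to TXDATA */
    size_t tx_len;
} mock_uart_t;

static uint32_t uart_read(mock_uart_t *u, uint32_t off)
{
    if (off == UART_STATUS)
        return STATUS_TRDY | (u->rx_pos < u->rx_len ? STATUS_RRDY : 0u);
    if (off == UART_RXDATA && u->rx_pos < u->rx_len)
        return u->rx[u->rx_pos++];
    return 0u;
}

static void uart_write(mock_uart_t *u, uint32_t off, uint32_t value)
{
    if (off == UART_TXDATA && u->tx_len < sizeof u->tx)
        u->tx[u->tx_len++] = (uint8_t)value;
}

/* Stand-in for the hardware Kalman filter: passes the value through. */
static int16_t filter_stub(int16_t q97) { return q97; }

/* One pass of the bridge sequence. Returns 1 if a packet was processed and
 * 0 if the receive data ran out; a real FSM would wait in the polling state
 * rather than give up. */
int bridge_service(mock_uart_t *u)
{
    uint8_t in[4];
    for (int i = 0; i < 4; i++) {
        if (!(uart_read(u, UART_STATUS) & STATUS_RRDY))    /* steps 1-2 */
            return 0;
        in[i] = (uint8_t)uart_read(u, UART_RXDATA);        /* steps 3-4 */
    }

    int16_t az = (int16_t)(((uint16_t)in[0] << 8) | in[1]);
    int16_t el = (int16_t)(((uint16_t)in[2] << 8) | in[3]);
    az = filter_stub(az);                                  /* steps 5-6 */
    el = filter_stub(el);

    uint8_t out[4] = {
        (uint8_t)((uint16_t)az >> 8), (uint8_t)((uint16_t)az & 0xFFu),
        (uint8_t)((uint16_t)el >> 8), (uint8_t)((uint16_t)el & 0xFFu)
    };
    for (int i = 0; i < 4; i++) {
        while (!(uart_read(u, UART_STATUS) & STATUS_TRDY)) /* steps 7-8 */
            ;
        uart_write(u, UART_TXDATA, out[i]);                /* steps 9-10 */
    }
    return 1;
}
```

Feeding the mock the bytes 2D 00 16 C0 produces the same four bytes on the transmit side, which
confirms the byte ordering of the assembly and disassembly steps.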

The four PIO ports each occupy a 16-byte span, though they use only the first 4 bytes
(one 32-bit Avalon-MM word, of which only 16 bits are valid). The HPS can read these
ports at any time by performing a memory-mapped read at the appropriate base address. For
example, if the azimuth raw value PIO has base address 0xFF200000, the HPS can execute:

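#include <stdint.h>
/* Bare-metal access; under Linux, map this physical address first (e.g. mmap() on /dev/mem). */
volatile uint32_t *az_raw_ptr = (volatile uint32_t *)0xFF200000u;
uint16_t azimuth_raw = (uint16_t)(*az_raw_ptr & 0xFFFFu);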

This memory-mapped approach provides a simple, efficient interface between hardware
and software, avoiding the complexity of DMA setups or interrupt handling for this monitoring
application.
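
A slightly more general helper might read any of the four ports and convert the value to
degrees. The sketch below assumes the four PIO cores are placed back to back in the address map
and in the order shown; both are hypothetical, and the real base addresses should be taken from
the Platform Designer address map. Passing the base pointer as an argument keeps the function
testable on a PC.

```c
#include <stdint.h>

#define PIO_SPAN_BYTES 16u   /* each PIO occupies a 16-byte span (from the text) */

/* Hypothetical ordering of the four ports within one contiguous block. */
typedef enum { PIO_AZ_RAW = 0, PIO_EL_RAW, PIO_AZ_FILT, PIO_EL_FILT } pio_port_t;

/* Read one PIO data register and convert its Q9.7 contents to degrees.
 * `base` points at the data register of the first port. */
double pio_read_deg(const volatile uint32_t *base, pio_port_t port)
{
    const volatile uint32_t *reg =
        base + ((uint32_t)port * PIO_SPAN_BYTES) / sizeof(uint32_t);
    int16_t q97 = (int16_t)(*reg & 0xFFFFu);   /* only the low 16 bits are valid */
    return q97 / 128.0;                        /* 7 fractional bits */
}
```

On the target, `base` would be the mapped address of the first PIO core; in a unit test it can
simply point at an ordinary array.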

The address map also reveals the HPS memory controller’s address range, which manages
the external 1 GB DDR3 SDRAM. While not directly used by the current Kalman filtering
pipeline, this memory is available for future enhancements such as storing long-term tracking
history, buffering sensor data for offline analysis, or implementing more sophisticated prediction
models that require large lookup tables.

In summary, the DE1-SoC FPGA subsystem provides a highly capable, extensible platform
for real-time signal processing. The careful division of responsibilities—FPGA fabric for time-
critical filtering, HPS for configuration and monitoring—combined with industry-standard
interconnect and memory-mapped I/O, results in a system that is both high-performance and
maintainable. The modular architecture, enabled by Platform Designer and the Avalon-MM
protocol, allows for rapid iteration and future enhancement without requiring redesign of the
entire system.

3.3 Kalman Filter Hardware Implementation
3.3.1 Algorithm Selection and Adaptation
The core signal processing component of this project is an adaptive scalar Kalman filter specif-
ically designed for solar tracking applications. The choice of a scalar (single-state) filter rather
than a multi-dimensional variant is justified by the tracking problem’s structure: azimuth and
elevation angles evolve independently and can be treated as separate estimation problems.
This simplification reduces hardware complexity while maintaining optimal performance for
the application.

The filter implements the standard discrete Kalman filter equations but incorporates a
critical adaptation: dynamic measurement noise covariance (R) scaling based on innova-
tion magnitude. This adaptive strategy addresses a fundamental challenge in solar track-
ing—distinguishing between legitimate sun position changes and transient noise spikes caused
by clouds, reflections, or sensor glitches. A fixed-parameter Kalman filter would either track
noise (if R is too small) or respond sluggishly to real changes (if R is too large). The adaptive
approach implemented here adjusts R dynamically: when the innovation (the difference between
measurement and prediction) remains small, R stays at its baseline value and the filter trusts
the measurements. When the innovation spikes, R is scaled up with the size of the spike, causing
the filter to rely more heavily on its prediction model and reject the spike.

This adaptive mechanism operates in two tiers. For moderate deviations, a quadratic
scaling function gradually increases R, providing smooth noise rejection while maintaining
responsiveness. For extreme spikes exceeding 3.75° above the baseline innovation average,
a hard rejection threshold activates, scaling R by a factor of 640 to effectively ignore the
measurement entirely. This two-tier strategy ensures robust performance across diverse con-
ditions—from clear sky tracking (minimal adaptation) to partly cloudy scenarios (frequent
moderate spikes) to severe transient disturbances (hard rejection of outliers).


The implementation is particularly well-suited to the solar tracking context. Sun posi-
tion changes gradually and predictably during normal operation, producing small innovations.
Cloud shadows and reflections, conversely, create large, sudden deviations. The adaptive filter
exploits this distinction, achieving approximately 70% noise reduction across test scenarios
while maintaining fast response to legitimate position changes.

3.3.2 Fixed-Point Numerical Representation
Hardware efficiency in FPGA implementations is critically dependent on numerical representa-
tion choices. Floating-point arithmetic, while offering wide dynamic range and high precision,
consumes substantial logic resources and introduces pipeline latency. For the solar tracking
application, the required angle range is limited (0° to 180° for servo motors) and precision re-
quirements are modest (sub-degree accuracy is sufficient). These constraints enable the use of
fixed-point arithmetic, which provides dramatic resource savings with negligible performance
impact.

The implemented system employs Q9.7 format—a 16-bit signed fixed-point representation
with 9 integer bits and 7 fractional bits. This format provides a numerical range of -256.0 to
+255.99° with a resolution of 0.0078°. The 9 integer bits easily accommodate the servo mo-
tor’s 0-180° range with margin for intermediate calculations. The 7 fractional bits provide 128
discrete levels per degree, yielding precision of approximately 0.5 arcminutes—far exceeding
the mechanical accuracy of typical servo motors.
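
The conversion between degrees and Q9.7 can be sketched in C as follows. This is a minimal illustration, not the project firmware: the names q97_t, q97_from_degrees, and q97_to_degrees are hypothetical, and round-to-nearest with saturation is an assumed policy.

```c
#include <stdint.h>

/* Hypothetical type name: a 16-bit signed value holding degrees * 128. */
typedef int16_t q97_t;

#define Q97_FRAC_BITS 7
#define Q97_ONE       (1 << Q97_FRAC_BITS)   /* 128 LSBs per degree */

/* Convert degrees to Q9.7, rounding to nearest and saturating at the
 * representable range (assumed policy, not taken from the report). */
static q97_t q97_from_degrees(float deg)
{
    float scaled = deg * (float)Q97_ONE;
    if (scaled >= 32767.0f)  return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (q97_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a Q9.7 value back to degrees. */
static float q97_to_degrees(q97_t v)
{
    return (float)v / (float)Q97_ONE;
}
```

For example, 90.25° maps to the integer 11552, and one LSB maps back to 0.0078125°.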

The Q9.7 format’s efficiency stems from its hardware implementation. Multiplications
require 16-bit×16-bit integer multipliers (available as hard IP blocks in the Cyclone V FPGA)
followed by a simple right-shift operation to extract the Q9.7 result from the Q18.14 inter-
mediate product. Additions and subtractions operate directly on Q9.7 values with overflow
detection. Division, the most resource-intensive operation, is required only once per filter
iteration (to compute Kalman gain K = P/(P+R)) and is implemented using a pipelined
restoring division algorithm that completes in 18 clock cycles.
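
The multiply-then-rescale step described above can be modelled in C as shown below. The function name q97_mul is hypothetical, and the rounding behaviour is an assumption: the sketch divides by 128 (truncating toward zero) as a portable stand-in for the 7-bit right shift, whereas the RTL may round differently.

```c
#include <stdint.h>

/* Multiply two Q9.7 values. The 16x16 product is a 32-bit Q18.14 number;
 * dividing by 128 returns it to 7 fractional bits, and the result is
 * saturated to the signed 16-bit range. */
static int16_t q97_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* Q18.14 intermediate */
    int32_t rescaled = product / 128;            /* back to 7 fractional bits */
    if (rescaled > INT16_MAX) return INT16_MAX;
    if (rescaled < INT16_MIN) return INT16_MIN;
    return (int16_t)rescaled;
}
```

With this model, 0.5 × 2.0 (64 × 256 in raw units) yields 128, i.e. 1.0, while 180 × 180 saturates at the largest representable value.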

To illustrate the format’s adequacy, consider a typical scenario: the sun moves approximately
15° per hour, or 0.25° per minute. With the control loop operating at 10 Hz, the
expected position change between samples is about 0.0004°. The Q9.7 format’s 0.0078° resolution
is therefore roughly 19× coarser than this per-sample motion, so the sun’s drift appears as a
one-LSB step about once every 19 samples (roughly every 2 seconds). The resulting quantization
error, at most one LSB, remains negligible next to the degree-level sensor noise the filter is
designed to suppress. For the Kalman filter’s internal calculations, the format supports covariance values
up to 255, easily accommodating the initialization value of 2.0 and typical steady-state values
below 0.5.


3.3.3 Hardware Architecture and Parallelization

Figure 3.6: Kalman filter top-level RTL block diagram showing dual independent
filter instances and external interface signals

Figure 3.6 presents the top-level architecture of the Kalman filter subsystem. The design in-
stantiates two independent kalman_scalar modules—one for azimuth angle estimation and
one for elevation angle estimation. This parallel instantiation is a key architectural decision:
because azimuth and elevation are mechanically and mathematically independent, they can
be filtered simultaneously without resource sharing. The dual-filter approach provides several
benefits: (1) it eliminates the need for time-multiplexing a single filter between axes, thereby
doubling effective throughput; (2) it simplifies the control logic by avoiding the state manage-
ment required for multiplexing; and (3) it enables truly parallel processing, where both axes
converge to optimal estimates simultaneously.

Each filter instance receives the same set of inputs: the raw angle measurement for its axis, the
process noise covariance Q, the measurement noise covariance R, and enable signals. The filters
operate independently and generate their respective filtered outputs along with status signals (converged,
error flags). This architecture exemplifies the "spatial parallelism" paradigm of FPGAs: rather
than serially processing azimuth then elevation, both execute concurrently in separate logic
regions of the FPGA fabric.


FSM-Driven Control Path

Figure 3.7: Kalman filter finite state machine showing the sequential control flow
through predict, update, and output phases


Figure 3.7 illustrates the finite state machine that orchestrates the Kalman filter’s operation.
The FSM implements a classic control-datapath separation: the state machine determines
which operations execute and when, while the datapath (arithmetic units, registers) performs
the actual computations. This separation simplifies verification and enables independent op-
timization of control logic and arithmetic circuits.

The state machine begins in IDLE, awaiting a start signal that indicates new measurement
data is available. Upon activation, it transitions through several key states that correspond
directly to the Kalman filter algorithm:

PREDICT: In this state, the state estimate is propagated forward in time using the system
model. For the solar tracking application, the model assumes quasi-static behavior (the sun
position changes slowly between samples), so the prediction simply copies the previous estimate:
$x_{\text{pred}} = x_{\text{est}}$. The error covariance, however, must increase to reflect the uncertainty
introduced by the time step: $P_{\text{pred}} = P_{\text{est}} + Q$. The innovation (difference between
measurement and prediction) is also computed in this state:
$\text{innovation} = \text{measurement} - x_{\text{pred}}$.
These operations can execute in parallel, exploiting the FPGA’s spatial architecture.

CALC_R: The adaptive noise rejection logic activates here. Using the innovation magni-
tude calculated in the previous state, the filter determines whether the measurement appears
consistent with the prediction (small innovation) or represents a potential outlier (large in-
novation). The effective measurement noise R is scaled accordingly: for small innovations,
R remains at its baseline value; for moderate excesses, R increases quadratically with the
innovation magnitude; for extreme spikes, R is multiplied by 640 to effectively ignore the
measurement. This computation uses the innovation’s absolute value, the running average of
past innovations (maintained as innov_avg), and several threshold comparisons to select the
appropriate scaling factor.

START_DIV / WAIT_DIV: Kalman gain computation requires division:
$K = P_{\text{pred}} / (P_{\text{pred}} + R)$. The FSM initiates the pipelined division unit in START_DIV and then transitions to
WAIT_DIV, where it remains until the divider signals completion. The 18-cycle latency of
the divider represents the primary bottleneck in the filter’s overall throughput, but it is
unavoidable for this operation. The division is implemented using a restoring division algorithm
that trades latency for hardware simplicity and numerical accuracy.

UPDATE: With the Kalman gain available, the state estimate is updated by incorporating
the innovation: $x_{\text{est}} = x_{\text{pred}} + K \times \text{innovation}$. This state also includes saturation logic to
ensure the result remains within valid bounds (0° to 180° for servo motors).

UPDATE_P: The error covariance is updated to reflect the information gained from the
measurement: $P_{\text{est}} = P_{\text{pred}} - K \times P_{\text{pred}}$. A minimum covariance floor is enforced to prevent
numerical underflow and maintain filter stability.

OUTPUT: The final state applies clamping to the estimate (ensuring it falls within the
servo motor’s physical range), asserts the done signal to notify external logic that results are
ready, and increments the sample counter used for tracking convergence.

The FSM also includes a BYPASS state, activated when the filter is disabled, that simply
passes the raw measurement through without filtering. This mode is essential for system
testing and comparison experiments.
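
The control flow described above can be summarized as a next-state function. The following C sketch uses the state names from the text; the enum encoding, the input names (start, enable, div_done), and the assumption that BYPASS is entered from IDLE and returns to IDLE are illustrative choices, not details confirmed by the RTL.

```c
#include <stdbool.h>

/* State names follow the text; encoding and transition conditions
 * are assumptions made for illustration. */
typedef enum {
    ST_IDLE, ST_PREDICT, ST_CALC_R, ST_START_DIV, ST_WAIT_DIV,
    ST_UPDATE, ST_UPDATE_P, ST_OUTPUT, ST_BYPASS
} kf_state_t;

static kf_state_t kf_next_state(kf_state_t s, bool start, bool enable, bool div_done)
{
    switch (s) {
    case ST_IDLE:
        if (!start) return ST_IDLE;               /* wait for a new measurement */
        return enable ? ST_PREDICT : ST_BYPASS;   /* filter disabled: pass through */
    case ST_PREDICT:   return ST_CALC_R;
    case ST_CALC_R:    return ST_START_DIV;
    case ST_START_DIV: return ST_WAIT_DIV;
    case ST_WAIT_DIV:  return div_done ? ST_UPDATE : ST_WAIT_DIV;
    case ST_UPDATE:    return ST_UPDATE_P;
    case ST_UPDATE_P:  return ST_OUTPUT;
    case ST_OUTPUT:    return ST_IDLE;
    case ST_BYPASS:    return ST_IDLE;
    }
    return ST_IDLE;   /* defensive default */
}
```

Only WAIT_DIV depends on an external completion signal; every other filtering state advances unconditionally, which is why the divider latency dominates the iteration time.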

Parallel Datapath

While the FSM enforces sequential execution of high-level operations (K cannot be computed
before $P_{\text{pred}} + R$ is available), the datapath exploits parallelism within each state. For example,
during the PREDICT state, three independent calculations occur simultaneously:
(1) $x_{\text{pred}} = x_{\text{est}}$ (a simple register copy), (2) $P_{\text{pred}} = P_{\text{est}} + Q$ (an addition), and
(3) $\text{innovation} = \text{measurement} - x_{\text{est}}$ (a subtraction, equivalent to subtracting
$x_{\text{pred}}$ because the two are equal). These operations have no data dependencies, so the
synthesis tool maps them to parallel arithmetic circuits that all complete within a single clock
cycle.
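
The arithmetic carried out across the PREDICT, UPDATE, and UPDATE_P states can be summarized by a floating-point reference model. The sketch below is such a model under stated assumptions: the struct and function names are hypothetical, r is the effective measurement noise already produced by CALC_R, and p_floor stands in for the covariance floor whose actual value is not given in the text.

```c
/* Floating-point reference model of one scalar Kalman iteration.
 * Names are hypothetical; the 0..180 clamp mirrors the servo range. */
typedef struct {
    double x_est;   /* state estimate, degrees */
    double p_est;   /* error covariance */
} kf_scalar_t;

static double kf_step(kf_scalar_t *kf, double measurement,
                      double q, double r, double p_floor)
{
    /* PREDICT: quasi-static model, covariance grows by Q. */
    double x_pred = kf->x_est;
    double p_pred = kf->p_est + q;
    double innovation = measurement - x_pred;

    /* Kalman gain (the START_DIV / WAIT_DIV division in hardware). */
    double k = p_pred / (p_pred + r);

    /* UPDATE: blend prediction and measurement, then clamp. */
    double x = x_pred + k * innovation;
    if (x < 0.0)   x = 0.0;
    if (x > 180.0) x = 180.0;

    /* UPDATE_P: shrink covariance, enforcing a minimum floor. */
    double p = p_pred - k * p_pred;
    if (p < p_floor) p = p_floor;

    kf->x_est = x;
    kf->p_est = p;
    return x;
}
```

With an estimate of 90°, P = 2.0, Q = 0, and R = 2.0, a measurement of 92° gives K = 0.5, an updated estimate of 91°, and P = 1.0. A model of this kind is also useful as a golden reference when checking fixed-point simulation output.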

The design instantiates dedicated arithmetic modules for key operations:

fp_add_sat and fp_sub_sat: These modules perform saturating addition and subtraction
in Q9.7 format. Saturation logic detects overflow/underflow conditions and clamps
results to the maximum/minimum representable values, preventing wraparound errors that
would corrupt the filter state.
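
The saturating behaviour can be illustrated with a C stand-in for these two modules. The functions below borrow the module names for readability but are a software sketch, not the RTL; the helper sat16 is hypothetical.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the signed 16-bit Q9.7 range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Widen to 32 bits first so the true sum or difference cannot wrap,
 * then clamp instead of letting the result overflow. */
static int16_t fp_add_sat(int16_t a, int16_t b) { return sat16((int32_t)a + b); }
static int16_t fp_sub_sat(int16_t a, int16_t b) { return sat16((int32_t)a - b); }
```

Without saturation, 32000 + 1000 would wrap to a large negative number; with it, the result is held at 32767 (about +255.99°).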

fp_multiply: Fixed-point multiplication generates a 32-bit intermediate result (Q18.14
format) which is then shifted right by 7 bits and saturated to produce a Q9.7 output. The
multiplier uses the FPGA’s hard DSP blocks for efficiency.

fp_divide_fast: The division module implements a 16-bit restoring division algorithm
optimized for the Kalman gain calculation where the result is always between 0 and 1 (since P
and R are both positive, P/(P + R) < 1). This constraint enables optimizations that reduce
the divider’s logic footprint.
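
To show how the constraint that the quotient is below 1 simplifies restoring division, the following C sketch computes only the fractional bits of num/den, one bit per iteration. It is an illustration under assumptions: the function name, the 7-bit default fraction width, and the loop structure are not taken from fp_divide_fast, whose internal bit width and cycle count may differ.

```c
#include <stdint.h>

/* Restoring division for a ratio known to be below 1.
 * Returns num/den as an unsigned fraction with frac_bits bits
 * (frac_bits = 7 gives a gain in Q9.7). Requires 0 <= num < den. */
static uint16_t restoring_div_frac(uint16_t num, uint32_t den, unsigned frac_bits)
{
    uint32_t rem = num;      /* partial remainder */
    uint16_t quot = 0;
    for (unsigned i = 0; i < frac_bits; ++i) {
        rem <<= 1;           /* shift in the next (zero) dividend bit */
        quot <<= 1;
        if (rem >= den) {    /* trial subtraction succeeds: quotient bit is 1 */
            rem -= den;
            quot |= 1u;
        }                    /* otherwise the remainder is left as it was */
    }
    return quot;
}
```

For P = 2.0 and R = 2.0 in raw Q9.7 units (256 and 256), calling restoring_div_frac(256, 512, 7) returns 64, i.e. K = 0.5.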

By implementing these operations as combinational logic (for add/multiply) or pipelined
sequential logic (for division), the datapath achieves maximum throughput given the algorith-
mic constraints. The filter’s overall latency—from receiving a measurement to producing a
filtered output—is approximately 25 clock cycles at 50 MHz, corresponding to 500 nanosec-
onds. This is three orders of magnitude faster than a software implementation on the Arduino
or ESP32 microcontrollers.

Critical Component: The Kalman-to-Avalon Bridge

Figure 3.8: External bus to Avalon bridge timing diagram showing the standard
Avalon-MM read and write cycles (adapted from Intel University Program IP doc-
umentation)


Figure 3.9: Kalman-to-Avalon bridge state machine showing UART polling, data
assembly, filter triggering, and result transmission phases

The Kalman filter cores operate using simple handshake signals (start, done) and parallel
data buses. However, the broader FPGA system—particularly the UART controller and the
HPS monitoring interface—communicates via the Avalon Memory-Mapped (Avalon-MM) bus
protocol. The Kalman_Avalon_Bridge module serves as the translator between these two
domains, and its correct operation is critical to system functionality.

Figure 3.8 illustrates the standard Avalon-MM protocol timing. Read and write transac-
tions consist of presenting an address and control signals (read or write enable, byte enable),
waiting for the slave device to assert acknowledge, and then capturing (for reads) or retiring
(for writes) the data. The protocol is synchronous to a single clock domain and uses a simple,
non-pipelined handshake that ensures reliable communication even across modules synthesized
by different tools or from different IP vendors (Intel Corporation, 2018).

Figure 3.9 presents the bridge’s state machine, which implements the following protocol:

Polling Phase (IDLE → CHECK_RRDY → WAIT_RRDY): The bridge continuously
polls the UART’s status register to detect when new data has arrived from the Arduino. It
issues an Avalon-MM read transaction to the STATUS register address, waits for acknowledg-
ment, and examines the RRDY (Receive Ready) bit. If no data is available, the FSM returns
to idle and retries after a short delay to avoid excessive bus traffic.

Data Reception Phase (READ_BYTE → WAIT_READ → ASSEMBLE): When
RRDY indicates data availability, the bridge reads one byte from the UART’s RXDATA register.
This byte is stored in an internal buffer. The process repeats until four bytes have
been assembled, corresponding to the two-byte azimuth value and two-byte elevation value
transmitted by the Arduino in Q9.7 format: [AZ_HIGH][AZ_LOW][EL_HIGH][EL_LOW].
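
The four-byte frame layout can be made concrete with a pair of pack and unpack helpers. The function names below are hypothetical; the byte order (high byte first for each angle) follows the packet format stated above.

```c
#include <stdint.h>

/* Pack azimuth and elevation (Q9.7) into the frame
 * [AZ_HIGH][AZ_LOW][EL_HIGH][EL_LOW]. */
static void angle_packet_pack(int16_t az, int16_t el, uint8_t out[4])
{
    out[0] = (uint8_t)((uint16_t)az >> 8);
    out[1] = (uint8_t)((uint16_t)az & 0xFFu);
    out[2] = (uint8_t)((uint16_t)el >> 8);
    out[3] = (uint8_t)((uint16_t)el & 0xFFu);
}

/* Reassemble the two 16-bit values from a received frame. */
static void angle_packet_unpack(const uint8_t in[4], int16_t *az, int16_t *el)
{
    *az = (int16_t)(uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    *el = (int16_t)(uint16_t)(((uint16_t)in[2] << 8) | in[3]);
}
```

For instance, 90.25° (raw 11552 = 0x2D20) and 45.75° (raw 5856 = 0x16E0) are sent as the bytes 0x2D, 0x20, 0x16, 0xE0.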

Filter Triggering Phase (TRIGGER_KALMAN → WAIT_KALMAN): Once the
four-byte packet is complete, the bridge presents the azimuth and elevation values to the
Kalman filter inputs and asserts the angle_data_valid signal. It then waits for the filter to
complete processing. The filter’s filtered_valid signal indicates when results are ready. A
timeout mechanism (10 ms watchdog) is implemented to handle the pathological case where
the filter stalls; if triggered, the bridge falls back to transmitting the raw (unfiltered) values
to maintain system responsiveness.

Transmission Phase (CHECK_TRDY → WAIT_TRDY → WRITE_BYTE →
WAIT_WRITE → TRANSMIT_NEXT): The bridge polls the UART’s TRDY (Transmit
Ready) bit to ensure the transmitter can accept another byte, then writes the filtered azimuth and
elevation values back to the Arduino as a four-byte packet using the same format. Each
byte write is an Avalon-MM transaction to the TXDATA register, with the FSM waiting for
acknowledgment before proceeding to the next byte.

This bridge exemplifies a common FPGA design pattern: wrapping custom logic (the
Kalman filter) with a standard bus interface (Avalon-MM) to enable integration into larger
systems. The bridge handles all timing, flow control, and protocol details, presenting a clean,
well-defined interface to both the filter and the UART. Its modular design also facilitates test-
ing—the bridge can be simulated independently using an Avalon-MM testbench that mimics
UART behavior.

3.3.4 Adaptive Noise Rejection Strategy
The adaptive R scaling mechanism is the filter’s most sophisticated feature and represents
the primary algorithmic contribution of this work. The strategy addresses the fundamental
challenge of distinguishing transient noise spikes from legitimate sun position changes in real-
time.

The adaptation operates on the innovation signal—the difference between the measured
angle and the filter’s predicted angle. In steady-state tracking under clear sky conditions,
innovation reflects only sensor noise and is typically small (under 2°). When a cloud shadow
or reflection occurs, innovation spikes dramatically (10-40°). The challenge is that, from a single
sample, such a spike cannot be distinguished from a legitimate sudden change in sun position (which could
occur if, for example, the system is manually adjusted or if the tracker is recovering from a
temporary obstruction).

The implemented two-tier strategy resolves this ambiguity by exploiting temporal char-
acteristics. Noise spikes are transient—they last for one or a few samples before reverting
to normal levels. Legitimate position changes are sustained—if the sun truly moved by 20°,
subsequent measurements will be consistent with the new position. The filter adapts as
follows:

Tier 1: Soft Scaling (Moderate Deviations): The filter maintains a running average
of innovation magnitude (innov_avg), updated with a very slow exponential moving average
($\alpha = 0.015625 = 1/64$) that reflects the baseline noise level. For each new measurement, the excess
innovation is computed: $\text{excess} = \max(0, |\text{innovation}| - \text{innov\_avg})$. This excess is scaled
by a factor of $35/256 \approx 1/7.3$ and then squared to produce a quadratic scaling factor:
$\text{scale} = (\text{excess}/7.3)^2$. The effective measurement noise becomes
$R_{\text{eff}} = R_{\text{baseline}} \times (1 + \text{scale})$.

This quadratic relationship provides smooth, progressive adaptation. A small excess (0.5°
above baseline) produces minimal scaling (1.1×), allowing the filter to track gentle changes.
A moderate excess (2° above baseline) produces significant scaling (15×), causing the filter to
distrust the measurement and rely more on prediction. The quadratic form ensures that dou-
bling the excess increases the penalty by 4×, providing aggressive rejection of larger deviations
while maintaining sensitivity to small signals.

Tier 2: Hard Rejection (Extreme Spikes): When the excess innovation exceeds 3.75°,
indicating a severe outlier, the quadratic scaling is abandoned in favor of a fixed maximum scale
factor of 640×. This effectively forces the Kalman gain to nearly zero
($K \approx P/(P + 640R) \approx 0$ when $640R$ is much larger than $P$), causing the filter to ignore
the measurement entirely and coast on its
prediction. This hard threshold prevents catastrophic filter corruption from extreme outliers
(such as a 40° spike caused by a bird shadow or a reflection from a passing vehicle).

The 3.75° threshold was selected based on empirical testing with realistic noise profiles. It
is high enough to avoid false triggers during normal operation but low enough to catch genuine
outliers before they significantly influence the state estimate. The 640× scaling factor similarly
balances complete rejection (desirable for true outliers) against eventual recovery (necessary
if the "outlier" is actually a legitimate position change that persists).

Critically, the baseline innovation average (innov_avg) is updated only when the current
innovation is not itself an outlier (specifically, when |innovation| < 1.5 × innov_avg).
This prevents the filter from "learning" that large spikes are normal, which would defeat the
adaptation mechanism.
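
The two tiers and the gated baseline update can be sketched together in C. This follows the formulas as written, with the excess expressed in degrees; the hardware operates on fixed-point values, so the absolute scale factors it produces may differ from this floating-point model. The names, the struct, and the ordering (compute R first, then update the average) are assumptions, and innov_avg must be initialized to a nonzero value for the gate to admit any sample.

```c
/* Constants taken from the text; identifier names are hypothetical. */
#define EXCESS_GAIN     (35.0 / 256.0)   /* about 1/7.3 */
#define HARD_THRESHOLD  3.75             /* degrees of excess innovation */
#define HARD_SCALE      640.0
#define AVG_ALPHA       0.015625         /* 1/64 */
#define AVG_GATE        1.5

typedef struct { double innov_avg; } adapt_state_t;

static double adapt_effective_r(adapt_state_t *st, double innovation, double r_base)
{
    double mag = innovation < 0.0 ? -innovation : innovation;
    double excess = mag - st->innov_avg;
    if (excess < 0.0) excess = 0.0;

    double r_eff;
    if (excess > HARD_THRESHOLD) {
        r_eff = r_base * HARD_SCALE;          /* Tier 2: hard rejection */
    } else {
        double s = excess * EXCESS_GAIN;
        r_eff = r_base * (1.0 + s * s);       /* Tier 1: quadratic scaling */
    }

    /* Update the baseline only when the sample is not itself an outlier. */
    if (mag < AVG_GATE * st->innov_avg) {
        st->innov_avg += AVG_ALPHA * (mag - st->innov_avg);
    }
    return r_eff;
}
```

A 40° spike against a 1° baseline therefore returns 640 times the baseline R and leaves innov_avg unchanged, while an innovation below the baseline returns R unmodified and nudges the average slightly.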

This adaptive strategy achieves approximately 70% noise reduction across diverse test
scenarios (detailed in Section 3.5) while maintaining sub-5-sample response times to legitimate
sudden changes. The combination of smooth quadratic scaling and hard thresholding provides
robust performance across the full spectrum of operating conditions, from ideal clear-sky
tracking to challenging partly-cloudy environments with frequent transient disturbances.

3.4 System Integration and Co-Design
3.4.1 Hardware-Software Interface (HPS-FPGA)
The DE1-SoC platform’s integration of FPGA fabric and Hard Processor System (HPS) en-
ables sophisticated hardware-software co-design. In this project, the HPS serves primarily
a supervisory and monitoring role, reading filtered angle values from the FPGA fabric via
memory-mapped Parallel I/O (PIO) ports.

Four 16-bit PIO cores are instantiated in the Platform Designer (Qsys) system: az_raw,
az_filtered, el_raw, and el_filtered. Each PIO is configured as an input port from the
HPS perspective, meaning the FPGA fabric writes angle values to the PIO’s data register, and
the HPS reads these values via Avalon-MM transactions. The Platform Designer tool assigns
each PIO a unique base address in the HPS’s memory map.

3.4.2 Inter-Processor Communication Protocols
The heterogeneous architecture relies on two serial communication links to coordinate the
Arduino, FPGA, and ESP32 subsystems. Both links use standard UART protocol but operate
at different baud rates and serve distinct purposes.

Arduino-FPGA Communication (115200 baud)

The Arduino Mega communicates with the FPGA via a dedicated hardware UART (Serial3,
using pins TX3/RX3). This link operates at 115200 baud, a standard rate that the Arduino’s
hardware UART handles reliably; because the UART peripheral generates the bit timing itself,
the link adds little CPU overhead.
The high baud rate is necessary to maintain low latency in the filtering loop: at 115200 baud,
a four-byte packet requires approximately 350 microseconds to transmit (four bytes at 10 bits
each, assuming 8N1 framing), which is acceptable
for the 100 ms control loop cycle.
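
The serialization time quoted above can be checked with a small helper. The function name is hypothetical, and 8N1 framing (one start bit, eight data bits, one stop bit) is an assumption consistent with the 350 microsecond figure.

```c
#include <stdint.h>

/* Time to serialize n_bytes over a UART, in microseconds (rounded),
 * assuming 8N1 framing: 10 bit times per byte. */
static uint32_t uart_tx_time_us(uint32_t n_bytes, uint32_t baud)
{
    const uint64_t bits = (uint64_t)n_bytes * 10u;
    return (uint32_t)((bits * 1000000u + baud / 2u) / baud);
}
```

At 115200 baud this gives 347 microseconds for a four-byte packet and 694 microseconds for the eight bytes of a full request and response.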

The communication protocol follows a simple request-response pattern within each control
loop iteration. Every 100 ms, the Arduino firmware executes the following sequence:

1. Read four LDR sensors and compute differential errors

2. Execute PID control laws to generate raw azimuth and elevation angles

3. Convert raw angles from floating-point degrees to Q9.7 fixed-point format (multiply by
128)

4. Transmit four-byte packet to FPGA: [AZIMUTH_HIGH][AZIMUTH_LOW][ELEVATION_HIGH][ELEVATION_LOW]

5. Wait for FPGA response (with 100 ms timeout)

6. Receive four-byte filtered angle packet in same format

7. Convert filtered angles from Q9.7 back to integer degrees

8. Command servo motors with filtered angle values

The entire round-trip—Arduino transmission, FPGA filtering (including Kalman filter com-
putation in hardware), and FPGA transmission back to Arduino—completes in approximately
2 milliseconds. This latency is dominated by UART serialization time (approximately 700 microseconds
total for bidirectional transfer) rather than filter computation time, which executes in hardware
in approximately 500 nanoseconds. The remaining 98 milliseconds of the control loop cycle
are available for sensor reading, local computations, and ESP32 communication, ensuring the
system never blocks waiting for FPGA responses.

The Arduino firmware includes robust timeout handling: if a response is not received
within 100 milliseconds, the firmware assumes the FPGA has stalled and proceeds using the
raw (unfiltered) angle values as a fallback. This mechanism ensures that a failure in the FPGA
subsystem cannot deadlock the entire tracking system. In practice, timeouts are extremely
rare—occurring only during initial system power-up before the FPGA completes configuration
or in the event of a hardware fault.
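
The timeout behaviour can be sketched as a receive routine with an explicit deadline. The code below is an illustration only: fpga_link_t and its hooks are hypothetical stand-ins for the Arduino clock and serial-port calls, not the project firmware, and the two stub functions exist purely to demonstrate the timeout path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical I/O hooks standing in for a millisecond clock and a
 * non-blocking serial read. */
typedef struct {
    uint32_t (*now_ms)(void *ctx);
    bool     (*read_byte)(void *ctx, uint8_t *out);  /* false if no byte waiting */
    void     *ctx;
} fpga_link_t;

/* Wait up to timeout_ms for a four-byte reply. Returns true and fills
 * reply[] on success; returns false on timeout so the caller can fall
 * back to the raw (unfiltered) angles. */
static bool fpga_receive(const fpga_link_t *link, uint8_t reply[4], uint32_t timeout_ms)
{
    uint32_t start = link->now_ms(link->ctx);
    unsigned got = 0;
    while (got < 4) {
        uint8_t b;
        if (link->read_byte(link->ctx, &b)) {
            reply[got++] = b;
        } else if ((uint32_t)(link->now_ms(link->ctx) - start) >= timeout_ms) {
            return false;   /* unsigned subtraction tolerates clock wraparound */
        }
    }
    return true;
}

/* Demonstration stubs: a clock that advances 1 ms per query and a link
 * that never delivers data. */
static uint32_t fake_now(void *ctx) { return (*(uint32_t *)ctx)++; }
static bool silent_link(void *ctx, uint8_t *out) { (void)ctx; (void)out; return false; }
```

With the silent stub, fpga_receive returns false once 100 ms of simulated time have elapsed, which is the point at which the firmware would command the servos with the unfiltered values.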

Arduino-ESP32 Communication (9600 baud)

The second UART link connects the Arduino to the ESP32 microcontroller via Serial1 (pins
TX1/RX1) operating at 9600 baud. This lower rate is acceptable because the link carries
human-readable telemetry data for display on the web dashboard, not time-critical control
signals. The 9600 baud rate also ensures compatibility with a wide range of ESP32 modules
and simplifies debugging, as the ASCII-formatted messages can be monitored using standard
serial terminal software.

The Arduino transmits several types of messages to the ESP32:

Sensor Data: Periodic updates (every 300 ms) containing LDR readings (raw and filtered),
voltage, current, and power measurements. Format example:

DATA:512,508,495,510,510,506,493,508,6.45,0.11,0.71

Servo Positions: Updates whenever servo positions change, formatted as:

SERVO_POS:90,45


Angle Data: Real-time raw and filtered angle estimates from the Kalman filter, including
FPGA connection status:

ANGLE:90.25,90.12,45.75,45.80,OK

Statistics: System performance metrics updated every 5 seconds, including movement
counts, average LDR values, peak light intensity, and tracking lock duration.
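
As an illustration of how the receiving side might decode these ASCII messages, the C sketch below parses the ANGLE line. The struct, the function name, and the field order (raw azimuth, filtered azimuth, raw elevation, filtered elevation, status) are assumptions inferred from the example message rather than taken from the ESP32 firmware.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Field names and order are assumptions based on the example
 * "ANGLE:90.25,90.12,45.75,45.80,OK". */
typedef struct {
    float az_raw, az_filtered, el_raw, el_filtered;
    bool  fpga_ok;
} angle_msg_t;

/* Returns true only if the line is a complete ANGLE message. */
static bool parse_angle_msg(const char *line, angle_msg_t *out)
{
    char status[8] = {0};
    int n = sscanf(line, "ANGLE:%f,%f,%f,%f,%7s",
                   &out->az_raw, &out->az_filtered,
                   &out->el_raw, &out->el_filtered, status);
    if (n != 5) return false;
    out->fpga_ok = (strcmp(status, "OK") == 0);
    return true;
}
```

Lines with a different prefix, such as SERVO_POS, fail the match and can be handed to a separate parser.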

The ESP32 receives these messages, parses them, and broadcasts the data to connected
web dashboard clients via WebSocket. The ESP32 also accepts commands from the dashboard
(such as mode switching or manual servo positioning) and forwards them to the Arduino
over the same UART link. This bidirectional communication enables full remote control and
monitoring of the system.

The key design choice here is the separation of concerns: the Arduino-FPGA link handles
time-critical filtering with minimal latency, while the Arduino-ESP32 link handles human-
interface tasks where latency of hundreds of milliseconds is acceptable. This separation ensures
that dashboard updates or WiFi connectivity issues cannot disrupt the real-time tracking
control loop.

3.5 Validation and Testing Methodology
3.5.1 Unit-Level Verification
FPGA Filter Simulation

The Kalman filter RTL design was verified through comprehensive functional simulation using
Synopsys VCS, a leading commercial simulation tool widely used in the semiconductor industry.
The simulation environment consists of a SystemVerilog testbench (tb_kalman_comprehensive.sv)
that instantiates the filter design under test (DUT), generates realistic test stimuli, monitors
outputs, and computes error metrics.

The testbench implements seven distinct test scenarios, each designed to stress different
aspects of the filter’s performance:

• Incline Normal Noise: Gradual 0° to 180° ramp with 2° Gaussian noise

• Decline Normal Noise: Gradual 180° to 0° ramp with 2° Gaussian noise

• Incline with Spikes: Gradual incline with sharp transient spikes (20-45°) injected at
multiple points

• Decline with Spikes: Gradual decline with sharp transient spikes

• Incline Very Noisy: Gradual incline with 4° noise plus random large spikes (3% occur-
rence rate)

• Decline Very Noisy: Gradual decline with 4° noise plus random large spikes

• Sudden Changes: Legitimate sudden position changes (simulating manual reposition-
ing or rapid cloud tracking adjustments) interspersed with gradual tracking segments

Each scenario processes 5000 samples, corresponding to approximately 8 minutes of real-
time operation at 10 Hz sampling. The test data is generated using a Linear Feedback Shift
Register (LFSR) for reproducible pseudo-random noise. For the spike scenarios, deterministic
large-magnitude outliers are injected at specific sample indices to test the adaptive rejection
logic. For the sudden change scenario, the ideal trajectory includes six abrupt jumps (20-60°)
to verify that the filter correctly tracks legitimate changes rather than rejecting them as noise.
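
A reproducible LFSR noise source of the kind mentioned above can be sketched in C. The register width, tap positions, and seed used by the testbench are not given in the text, so this example uses a common 16-bit maximal-length configuration (taps at bits 16, 14, 13, and 11). It also produces uniformly distributed noise; how the testbench shapes LFSR output into Gaussian noise is not described, so that step is omitted.

```c
#include <stdint.h>

/* 16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (assumed configuration).
 * The state must be nonzero. */
static uint16_t lfsr16_next(uint16_t s)
{
    uint16_t bit = (uint16_t)(((s >> 0) ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u);
    return (uint16_t)((s >> 1) | (bit << 15));
}

/* Map an LFSR state to a zero-centered noise sample in Q9.7 LSBs,
 * spanning roughly +/- amplitude_lsb (uniform, not Gaussian). */
static int16_t lfsr_noise_q97(uint16_t s, int16_t amplitude_lsb)
{
    int32_t centered = (int32_t)s - 32768;   /* -32768 .. 32767 */
    return (int16_t)((centered * amplitude_lsb) / 32768);
}
```

Because the sequence depends only on the seed, every simulation run sees identical noise, which makes RMSE figures comparable between design revisions.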

The testbench drives the filter with Q9.7 fixed-point angle values via the angle_data_valid
handshake interface, mimicking the actual Avalon bridge behavior. It captures the filtered out-
put for each sample and computes the error relative to the ideal (noise-free) trajectory. Key
metrics accumulated include Root Mean Square Error (RMSE), maximum error, and conver-
gence statistics. Results for each scenario are written to separate CSV files for post-processing
and visualization in Python.

The VCS simulator provides cycle-accurate timing, allowing verification that the filter
completes processing within the expected latency (25 clock cycles from input valid to output
valid). The simulation also validates corner cases: filter initialization with the first mea-
surement, handling of boundary values (0° and 180°), and correct saturation behavior when
intermediate calculations would exceed the Q9.7 representable range.

Hardware-in-the-Loop Validation with Python Analysis

While RTL simulation verifies functional correctness, it operates on idealized mathematical
models of the signals. To validate the filter’s performance with realistic noise characteristics,
the simulation results are analyzed using a Python-based data analysis framework implemented
in a Jupyter notebook (kalman_comprehensive_analysis.ipynb).

The analysis workflow begins by loading the CSV files generated by the VCS testbench.
Each file contains columns for sample index, ideal angle, raw (noisy) angle, filtered azimuth,
filtered elevation, tracking error, and convergence status. The Python code computes several
key performance indicators:

RMSE Comparison: For each scenario, the RMSE of the raw input is compared against
the RMSE of the filtered output. The noise reduction percentage,
$\text{reduction} = (1 - \text{RMSE}_{\text{filtered}} / \text{RMSE}_{\text{raw}}) \times 100\%$,
quantifies the filter’s effectiveness. Target performance
is 50-70% noise reduction across all scenarios.

Error Distribution Analysis: Histograms of the tracking error (filtered output minus
ideal) reveal the filter’s statistical behavior. Gaussian-distributed errors centered at zero indi-
cate unbiased, optimal filtering. Heavy tails or skewed distributions would suggest systematic
bias or inadequate noise rejection.

Spike Rejection vs. Tracking Trade-off: For scenarios with spikes, zoomed plots around
injected outliers show whether the filter successfully rejects transients (filtered output remain-
ing close to ideal despite raw input spike) or incorrectly tracks them. For sudden change
scenarios, similar plots verify that legitimate jumps are followed within a few samples rather
than rejected.

Settling Time Analysis: After each jump in the sudden-change scenario, the
notebook computes how many samples are required for the filtered output to settle within
2° of the new ideal position. This settling time characterizes the filter’s responsiveness to
legitimate changes.
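
The settling-time metric can be stated precisely in code. The notebook itself is written in Python; the C sketch below only illustrates one reasonable definition (the number of samples after the jump until the error stays within the tolerance for the rest of the segment). The function name and the convention of returning -1 when the output never settles are assumptions.

```c
#include <stddef.h>

/* Samples needed after index `jump` for |filtered - ideal| to fall within
 * `tol` and remain there until the end of the segment (index n - 1).
 * Returns -1 if the output is still outside the band at the end. */
static long settling_samples(const double *filtered, const double *ideal,
                             size_t n, size_t jump, double tol)
{
    long last_bad = -1;   /* offset of the last out-of-band sample */
    for (size_t i = jump; i < n; ++i) {
        double err = filtered[i] - ideal[i];
        if (err < 0.0) err = -err;
        if (err > tol) last_bad = (long)(i - jump);
    }
    if (last_bad == (long)(n - jump) - 1) return -1;
    return last_bad + 1;
}
```

The segment passed in should end before the next jump so that each change is measured independently.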

The visualizations generated—including multi-panel overview plots, error distributions,
and detailed zoom windows—provide intuitive insight into filter performance across operating
conditions. The analysis indicates that the implemented adaptive strategy achieves roughly
67% noise reduction in normal conditions and 70% or more under severe noise (Section 4.2),
while successfully rejecting over 90% of injected
spikes and settling to new positions within 3-5 samples after legitimate changes.


3.5.2 Integration Testing
Full Pipeline Testing

The full pipeline test checks the entire system, from LDR sensor input on the Arduino to
angle display on the ESP32 dashboard. We tested both modes: filter bypass (raw data) and
filter enabled (filtered data). By observing the ESP32 dashboard, we saw that with the filter
enabled, the angle data was much smoother and servo movements were less jittery. In bypass
mode, the data was noisy and the servos moved more frequently. This clear difference confirms
that the Kalman filter effectively reduces noise and improves tracking performance.


Chapter 4

Results

4.1 Hardware Implementation and Synthesis Results
The FPGA implementation was synthesized using Intel Quartus Prime 18.1 targeting the
Cyclone V SoC (5CSEMA5F31C6) on the DE1-SoC board.

4.1.1 Resource Utilization and Performance

Figure 4.1: FPGA resource utilization summary

Table 4.1 summarizes the resource utilization. The complete system consumes 14% of logic
elements, 35% of RAM blocks, 28% of memory bits, and 18% of DSP blocks. This leaves
substantial room for future enhancements such as extended Kalman filter variants or additional
tracking axes.

Table 4.1: FPGA resource utilization

Resource          Used         Available    Utilization
Logic Elements    4,490        32,070       14%
RAM Blocks        30           87           35%
Memory Bits       1,246,208    4,450,000    28%
DSP Blocks        16           87           18%

The achieved maximum frequency (Fmax) is 52.62 MHz, which is 5.24% above the 50 MHz
system clock. The synthesis report confirms zero timing violations. The design achieves a
worst-case setup slack of +0.997 ns and a worst-case hold slack of +0.224 ns, so every timing
path, including the most critical one, meets its requirement with positive margin at 50 MHz.
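
The reported Fmax and setup slack can be cross-checked against each other. For a single clock domain, the setup slack is approximately the clock period minus the minimum period implied by Fmax. The sketch below shows that arithmetic; the function name is hypothetical, and the result differs from the reported slack in the last digit because Fmax is rounded to two decimals.

```python
def timing_margin(fmax_mhz, clock_mhz):
    """Return (approximate setup slack in ns, Fmax headroom in percent)."""
    clock_period_ns = 1000.0 / clock_mhz   # 20 ns at 50 MHz
    min_period_ns = 1000.0 / fmax_mhz      # shortest period the design supports
    slack_ns = clock_period_ns - min_period_ns
    headroom_pct = (fmax_mhz / clock_mhz - 1.0) * 100.0
    return slack_ns, headroom_pct


slack_ns, headroom_pct = timing_margin(52.62, 50.0)
print(f"setup slack ~ {slack_ns:.3f} ns, Fmax headroom {headroom_pct:.2f}%")
```

The slack of roughly 1 ns is about 5% of the 20 ns clock period.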

4.1.2 RTL Synthesis Hierarchy

Figure 4.2: Top-level synthesis hierarchy



Figure 4.3: Kalman filter top module synthesis showing dual independent filter
cores

Figures 4.2 and 4.3 show the clean, modular synthesis hierarchy. The design preserves
functional boundaries, enabling independent modification of subsystems. Each kalman_scalar
instance synthesizes to approximately 2,200 logic elements with dedicated arithmetic units and
FSM control logic. This modular approach proved valuable during development: when the
adaptive R scaling was enhanced, only the Kalman module required changes, without affecting
the bridge, UART, or system integration.

4.2 Kalman Filter Algorithm Performance
The adaptive scalar Kalman filter was evaluated in simulation using seven test scenarios of
5,000 samples each (500 s, or about 8.3 minutes, at 10 Hz). Simulations used Synopsys VCS
with a SystemVerilog testbench generating realistic trajectories with calibrated noise and
spikes.
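
The testbench itself is written in SystemVerilog. As an illustration only, the kind of stimulus it describes (a smooth trajectory plus Gaussian noise and occasional spikes) can be sketched in Python. The start and end angles, the interpretation of the noise level as a standard deviation, and the function name are assumptions. The 20-45° spike range and 3% spike rate are taken from the scenario descriptions later in this chapter.

```python
import random


def make_trajectory(n_samples=5000, start_deg=20.0, end_deg=160.0,
                    noise_sigma_deg=2.0, spike_rate=0.0,
                    spike_range_deg=(20.0, 45.0), seed=1):
    """Return (ideal, measured) angle lists for a linear sweep.

    start_deg, end_deg and seed are hypothetical; noise_sigma_deg is
    assumed to be a standard deviation.
    """
    rng = random.Random(seed)
    ideal, measured = [], []
    for k in range(n_samples):
        angle = start_deg + (end_deg - start_deg) * k / (n_samples - 1)
        sample = angle + rng.gauss(0.0, noise_sigma_deg)
        if rng.random() < spike_rate:
            magnitude = rng.uniform(*spike_range_deg)
            sample += magnitude if rng.random() < 0.5 else -magnitude
        ideal.append(angle)
        measured.append(sample)
    return ideal, measured


ideal, measured = make_trajectory(noise_sigma_deg=4.0, spike_rate=0.03)
print(len(measured), "samples,", len(measured) / 10.0, "s at 10 Hz")
```

Fixing the seed makes a scenario reproducible, so raw and filtered runs can be compared on identical input.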

4.2.1 Baseline Performance

Figure 4.4: Tracking performance for trajectory with raw vs filtered angles



Figure 4.5: Normal trajectory tracking with 2° Gaussian noise

For normal conditions (2° Gaussian noise), the filter achieved 66.7% noise reduction on the
incline trajectory (RMSE: 1.16° → 0.39°) and 67.3% on the decline trajectory (RMSE: 1.15° →
0.38°). The near-identical results indicate no directional bias.
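
The noise-reduction figures follow from the ratio of filtered to raw RMSE. The sketch below shows the calculation under that assumed definition; the function names are hypothetical. Recomputing from the two-decimal RMSE values gives results a few tenths of a percentage point below the reported ones, which is consistent with the reported percentages having been computed from unrounded RMSE values.

```python
import math


def rmse(estimates, truth):
    """Root-mean-square error between an angle series and the ideal series."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))


def noise_reduction_pct(raw_rmse, filtered_rmse):
    """Percentage reduction in RMSE achieved by filtering."""
    return (1.0 - filtered_rmse / raw_rmse) * 100.0


# Rounded RMSE values quoted in the text for normal conditions.
incline = noise_reduction_pct(1.16, 0.39)
decline = noise_reduction_pct(1.15, 0.38)
print(f"incline: {incline:.1f}%  decline: {decline:.1f}%")
```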

4.2.2 Robustness Under Severe Noise

Figure 4.6: Performance with 4° noise plus random large spikes (3% occurrence
rate)

Under severe conditions (4° noise + random spikes), the filter achieved 70.0% reduction for
incline and 73.5% for decline, meeting the 70% target for incline and exceeding it for decline
despite maximum raw errors of 6.7°.

4.2.3 Spike Rejection

Figure 4.7: Trajectory tracking with deliberate sharp spikes (20-45°)



Figure 4.8: Close-up showing spike attenuation

The filter achieved 51.2% reduction for incline with spikes and 57.9% for decline. Even the
largest 45° spike was attenuated to less than 18° deviation, lasting only 2-3 samples before
recovery.

4.2.4 Sudden Change Response

Figure 4.9: Response to legitimate sudden position changes (six abrupt jumps
20-60°)



Figure 4.10: Close-up showing convergence after sudden change without overshoot

The filter correctly distinguishes between transient spikes and sustained changes. After legiti-
mate jumps, the filter converges within 3-5 samples without overshoot, demonstrating proper
adaptation.
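
One way a scalar Kalman filter can separate transient spikes from sustained changes is to inflate the measurement noise R when the innovation is large, and to reopen the filter only if the large innovation persists. The sketch below illustrates that idea; it is not the FPGA's adaptation rule. All parameters (`q`, `r`, `gate_deg`, `r_scale`, `persist`) are hypothetical, so its attenuation and convergence figures will not match those reported above.

```python
def adaptive_scalar_kalman(measurements, q=0.01, r=1.0, gate_deg=10.0,
                           r_scale=100.0, persist=3):
    """Scalar Kalman filter (random-walk model) with innovation-gated R.

    Parameter values are illustrative assumptions, not the tuned constants.
    """
    x = measurements[0]   # state estimate (angle in degrees)
    p = 1.0               # estimate variance
    outlier_run = 0       # consecutive samples beyond the innovation gate
    estimates = []
    for z in measurements:
        p += q                                  # predict step
        innovation = z - x
        outlier_run = outlier_run + 1 if abs(innovation) > gate_deg else 0
        if outlier_run >= persist:
            # Deviation persisted: treat it as a real move, reopen the filter.
            p = r * r_scale
            r_eff = r
            outlier_run = 0
        elif outlier_run > 0:
            # Isolated large innovation: distrust the sample by inflating R.
            r_eff = r * r_scale
        else:
            r_eff = r
        gain = p / (p + r_eff)                  # update step
        x += gain * innovation
        p *= 1.0 - gain
        estimates.append(x)
    return estimates


spike = [30.0] * 20
spike[10] = 75.0                       # single 45 degree outlier
jump = [30.0] * 10 + [70.0] * 10       # sustained 40 degree change at index 10
spike_out = adaptive_scalar_kalman(spike)
jump_out = adaptive_scalar_kalman(jump)
print(max(abs(v - 30.0) for v in spike_out))
print(jump_out[10:14])
```

In this sketch the single outlier barely moves the estimate, while the sustained jump is accepted on the third consecutive large innovation. Because the gain stays below 1 and the input is noise-free, the estimate approaches the new level from below without overshoot.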

4.2.5 Performance Summary

Figure 4.11: Error distribution histograms for all scenarios



Figure 4.12: RMSE comparison and noise reduction across all scenarios

Table 4.2: Kalman filter performance summary

Scenario               Raw RMSE (°)    Filtered RMSE (°)    Reduction (%)
Normal Incline         1.16            0.39                 66.7
Normal Decline         1.15            0.38                 67.3
Very Noisy Incline     2.33            0.70                 70.0
Very Noisy Decline     2.32            0.61                 73.5
Incline with Spikes    2.17            1.06                 51.2
Decline with Spikes    2.18            0.92                 57.9

All six scenarios in Table 4.2 reach at least the 50% lower bound of the 50-70% reduction
target, and the two very noisy scenarios reach or exceed 70%. The adaptive Q9.7
fixed-point implementation introduces no observable numerical artifacts.
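
To give a sense of scale for the fixed-point claim, the sketch below quantizes an angle to Q9.7. It assumes that Q9.7 denotes a signed 16-bit word with 7 fractional bits, which this section does not state explicitly, and the helper names are hypothetical. Under that assumption the resolution is 1/128 ≈ 0.0078°, so the smallest filtered RMSE in Table 4.2 (0.38°) spans roughly 49 quantization steps.

```python
FRAC_BITS = 7                 # assumed number of fractional bits
WORD_BITS = 16                # assumed total signed word width
SCALE = 1 << FRAC_BITS        # 128 counts per degree


def to_q9_7(angle_deg):
    """Quantize an angle to the nearest Q9.7 code, saturating at the word limits."""
    raw = int(round(angle_deg * SCALE))
    lo = -(1 << (WORD_BITS - 1))
    hi = (1 << (WORD_BITS - 1)) - 1
    return max(lo, min(hi, raw))


def from_q9_7(raw):
    """Convert a Q9.7 code back to degrees."""
    return raw / SCALE


code = to_q9_7(45.3)
print(code, from_q9_7(code))  # prints 5798 45.296875
```

Rounding to nearest keeps the quantization error within half a step (about 0.004°), well below the filtered error levels reported in Table 4.2.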



4.3 Integrated System Demonstration
4.3.1 Real-Time Dashboard

Figure 4.13: ESP32 dashboard showing real-time raw vs filtered angles with FPGA
status



Figure 4.14: Dashboard displaying LDR readings, power monitoring, and system
statistics



Figure 4.15: Manual control interface with slider-based servo positioning

The ESP32 web dashboard successfully displays real-time an