-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathsobelFilter.html
More file actions
268 lines (246 loc) · 16.5 KB
/
sobelFilter.html
File metadata and controls
268 lines (246 loc) · 16.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
<!DOCTYPE HTML>
<html>
<head>
<title>MaxPatwardhan.github.io - Sobel Filtering Edge Detection Algorithm</title>
<link rel="icon" type="image/png" href="images/personal/icon.png">
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no" />
<link rel="stylesheet" href="assets/css/main.css" />
<noscript><link rel="stylesheet" href="assets/css/noscript.css" /></noscript>
<script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
</head>
<body class="is-preload">
<!-- Wrapper -->
<div id="wrapper">
<!-- Header -->
<header id="header">
<a href="index.html" class="logo">Home</a>
</header>
<!-- Nav -->
<nav id="nav">
<ul class="links">
<li><a href="index.html">Projects</a></li>
<li><a href="macro.html">Photography</a></li>
<li><a href="elements.html">About Me</a></li>
</ul>
<ul class="icons">
<li><a href="https://github.com/MaxPatwardhan" class="icon brands fa-github"><span class="label">GitHub</span></a></li>
<li><a href="https://www.linkedin.com/in/maxwell-patwardhan" class="icon brands alt fa-linkedin-in"><span class="label">LinkedIn</span></a></li>
</ul>
</nav>
<!-- Main -->
<div id="main">
<!-- Post -->
<section class="post">
<header class="major">
<h2>Sobel Edge Detection Algorithm</h2>
<h3>Image Filtering for Robot Localization and Mapping</h3>
<p>Personal Project; December 2022</p>
</header>
<span class="image fit"><img src="images/sobel/montage.jpg" alt="Sobel Edge Detection Output" /></span>
<!-- Background -->
<section id="background">
<h2>Background</h2>
<p>
Modern autonomous robots rely heavily on imaging in order to navigate their environments. Imaging technology comes
in a variety of styles ranging from Lidar to standard cameras that operate within the visible spectrum. Cameras provide
the robot with necessary information for path planning and localization. Yet, there is an inherent problem with imaging
technology: it is too information-dense. What does this mean? A typical camera provides a surfeit of data every second.
It is not possible for modern day computers to handle such a volume of information at a rate that is useful for real-time
autonomous decision making. Thus, the data must be filtered. Processing and filtering image data allows the controller
of the robot to parse necessary information in order to make decisions. The speed, accuracy, and efficiency of this filtering
is key to the performance of an autonomous agent. For this project I have built a filter that takes an image input and
returns the edges of the objects within it, commonly known as a Gaussian Filter with the Sobel Operator.
</p>
</section>
<!-- Importance of Filtering -->
<section id="importance">
<h2>The Importance of Filtering and Edge Detection</h2>
<p>Robotic systems equipped with cameras generate vast amounts of data. For example, a camera with a resolution of 1920×1080 operating at 30 frames per second produces
approximately:</p>
<p>$$
\underbrace{1920 \times 1080}_{\text{Resolution (W$\times$H)}} \times
\underbrace{3}_{\text{Color channels}}\times
\underbrace{30}_{\text{Frame rate}} =
1.87 \times 10^8 \, \text{bytes/second.}
$$
</p>
<p>
This data load is computationally prohibitive for real-time processing, especially when considering the small on-board computers typically used
in autonomous platforms. Filters reduce data volume while retaining information critical to robot tasks like path planning
and object detection. This project implements convolution-based filters, which efficiently process image data using numerical convolution in
<b>O(W × H)</b> computational time.
</p>
</section>
<!-- Convolution-Based Filtering -->
<section id="convolution">
<h2>Convolution-Based Filtering</h2>
<p>Convolution is the mathematical foundation of image filtering. The continuous convolution operation is defined as:</p>
<p>
$$ (f * g)(x) = \int_{-\infty}^\infty f(\tau)g(x - \tau) \, d\tau. $$
</p>
<p>For images, we use a discrete form:</p>
<p>
$$ (f * g)[x, y] = \sum_{i=-n}^n \sum_{j=-m}^m f[x - i, y - j]g[i, j] \textrm{, where } n \textrm{, }m \textrm{ is the size of }f[x,y], $$
</p>
<p>where \( f(x,y) \) is the image and \( g(i,j) \) is the filter kernel.</p>
</section>
<!-- Gaussian Denoising Filter -->
<section id="gaussian">
<h3>Gaussian Denoising Filter</h3>
<span class="image fit"><img src="images/sobel/WalliserInpt.jpg" alt="Sobel Edge Detection Output" /></span>
<p>
Let's walk through processing the above image. While it is certainly a lovely photo, there is all sorts of extraneous data that our robot doesn't need to process.
First, we will process it with a Gaussian filter, and secondly we will use a Sobel filter to extract the edges. <br>The Gaussian filter reduces noise by averaging pixel values based on a Gaussian distribution. Its kernel matrix can be represented as:</p>
<p>
$$ g[x, y] = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix}. $$
</p>
<p>The weight (\( \sigma \)) determines the influence of neighboring pixels. After applying this filter, the image is significantly smoothed, making subsequent edge detection more effective.</p>
<span class="image fit"><img src="images/sobel/Gaussfilt.jpg" alt="Sobel Edge Detection Output" /></span>
<p>
The diagram above shows a sample image (left) and a denoised image (right). To reduce data size, the color values at each place were
replaced with a brightness value, making the image black-and-white. The right image should look slightly blurred compared to the original, which is the desired effect of the denoising.
</p>
</section>
<!-- Sobel Edge Detection -->
<section id="sobel">
<h3>The Sobel Operator</h3>
<p>
Next, we will convolve this image with the Sobel Filter, in order to reduce the image to its edges and pertinent
information. The Sobel filter consists of two basis kernels \(s_x \) and \(s_y \), each of which are convolved with our image, \(f(x,y)\)
and then summed in order to generate the filtered image. The kernels essentially take a directional gradient along the \(x \) and \(y \) axes at each point of
the matrix, and thus each kernel operates in only one direction. To retrieve the final construction, the magnitude of the
gradient at each pixel must be taken. The kernels take the form:
</p>
<p>
$$ s_x = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}, \quad
s_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}. $$
</p>
<p>
Now, we can convolve our image, \( f(x,y)\), with the kernels and compute the magnitude of the directional components of the gradient to illustrate the edge density:
</p>
<p>
$$ G_x =
\begin{bmatrix}
1 & 0 & -1 \\
2 & 0 & -2 \\
1 & 0 & -1
\end{bmatrix}
\ast f[x,y], \quad G_y =
\begin{bmatrix}
1 & 2 & 1 \\
0 & 0 & 0 \\
-1 & -2 & -1
\end{bmatrix}
\ast f[x,y]. $$
</p>
<p>
Lastly, with the uni-directional gradient values calculated, the magnitude can be calculated:
</p>
<p>
$$ G[a,b] = \sqrt{G_x[a,b]^2 + G_y[a,b]^2} \quad \forall \left | a \right |,\left | b \right | \leq m,n. $$
</p>
<p>The result is an edge-detected image, retaining essential features for robot path planning while discarding redundant data.</p>
<div style="text-align: center;">
<img src="images/sobel/filt.jpg" alt="" style="display: block; margin: 0 auto; max-width: 80%;" />
</div>
<p>
<br>Note that there is still a considerable amount of noise in this image. The simplest method to do away with this unwanted data is to
delete any pixel value beneath a certain threshold, leaving the desired edges extracted from the image:<br>
</p>
<div style="text-align: center;">
<img src="images/sobel/edge.jpg" alt="" style="display: block; margin: 0 auto; max-width: 80%;" />
</div>
<p><br></p>
<h3>Importance of the Gaussian Filter</h3>
<p>
It is important to note the importance of the Gaussian filter in this process. Without it, the filtered image would lack much of the necessary noise reduction.
Below is an image convolved with the Sobel filters without prior denoising. Note the large amount of leftover noise that passed the threshold test:<br>
</p>
<div style="text-align: center;">
<img src="images/sobel/WalliserEdge.jpg" alt="" style="display: block; margin: 0 auto; max-width: 80%;" />
</div>
<p><br></p>
</section>
<!-- Analysis and Conclusion -->
<section id="conclusion">
<h2>Analysis and Conclusion</h2>
<p>
The processed image is an order of magnitude smaller than the original, reducing the computational expense of localization and decision-making. Without preprocessing,
robots cannot handle the raw data load effectively, resulting in delayed or inaccurate actions. In a short set of steps, we have managed to perform a series of calculations that
actually make localization a feasible real-time task. The figure below compares the frequency and intensity of pixel values ranging from 0 to 255 on both the original
image and edge-detected image:
</p>
<figure class="image fit">
<img src="images/sobel/histcompare.JPG" alt="Sobel Edge Detection Output" />
<figcaption>(Disregard the difference in magnitude of the axes, it is due to the large number of null values present in the filtered image)</figcaption>
</figure>
<p>The table below shows the difference in size between the original and processed images:</p>
<table style="border-collapse: collapse; width: 100%;">
<thead>
<tr>
<th>Stage</th>
<th>Size (Bytes)</th>
<th>Reduction from Original</th>
</tr>
</thead>
<tbody>
<tr>
<td>Original RGB Image</td>
<td>6,220,800 (6.22 Mb)</td>
<td>0%</td>
</tr>
<tr>
<td>Grayscale Image</td>
<td>2,073,600 (2.07 Mb)</td>
<td>~67%</td>
</tr>
<tr>
<td>Edge Map</td>
<td>259,200 (259 Kb)</td>
<td>~96%</td>
</tr>
</tbody>
</table>
<p>
The processed image is a handful of kilobytes, compared to the megabytes of the original image. Its pixel-matrix
representation is full of zeros, which makes computational operations necessary in localization far less expensive than they
otherwise would be. Though these data filtering and image processing techniques are not at the current forefront of the
field, they still do a good job in representing how powerful and necessary image processing is. Without quick access to
good sensor information, a robot cannot make good decisions in real time, and a robot that cannot make good decisions is not a very
good robot in the first place.
</p>
</section>
</section>
<h2>Original Paper</h2>
<iframe src="images/sobel/Mechatronics_Graduate_Report.pdf" width="100%" height="800px" title="Robotic Metamaterials Thesis" style="margin-bottom: 50px;"></iframe>
</div>
<!-- Footer -->
<footer id="footer">
<section>
<div style="display: flex; align-items: center; gap: 10px;">
<h3 style="margin: 0;">Email:</h3>
<p style="margin: 0;"><a href="mailto:maxwellpatwardhan@gmail.com">maxwellpatwardhan@gmail.com</a></p>
</div>
<div style="display: flex; align-items: center; gap: 10px;">
<h3 style="margin: 0;">Other Pages:</h3>
<ul class="icons alt" style="display: flex; gap: 10px; list-style: none; padding: 0; margin: 0;">
<li><a href="https://github.com/MaxPatwardhan" class="icon brands alt fa-github"><span class="label">GitHub</span></a></li>
<li><a href="https://www.linkedin.com/in/maxwell-patwardhan" class="icon brands alt fa-linkedin-in"><span class="label">LinkedIn</span></a></li>
</ul>
</div>
</section>
</footer>
</div>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/jquery.scrollex.min.js"></script>
<script src="assets/js/jquery.scrolly.min.js"></script>
<script src="assets/js/browser.min.js"></script>
<script src="assets/js/breakpoints.min.js"></script>
<script src="assets/js/util.js"></script>
<script src="assets/js/main.js"></script>
<script src="assets/js/slideshow.js"></script>
</body>
</html>