Median filter and edge preservation

Denis_Brojan · September 22, 2017, 4:20pm

Hi James!

I am playing around with your core and profile modules trying to extract some profiles. I am interested in finding penumbra positions for single profiles. You have already created functions that do that, but my problem is this: if I increase the level of filtering, I get different results for the right-sided penumbras. I wonder if this is normal for this type of filtering. I have absolutely no knowledge of filtering, so I have no idea how it works.

This is a simple script to demonstrate my dilemma:

```
import matplotlib.pyplot as plt
from pylinac import image
from pylinac.core import profile

# Load the image, invert it, and crop 2 pixels
img = image.load("10x10 Field.dcm")
img.invert()
img.crop(pixels=2)

# Two profiles from the same row; only p1 is filtered
p1 = profile.SingleProfile(img.array[450, :])
p2 = profile.SingleProfile(img.array[450, :])
p1.filter(size=0.1)
plt.plot(p1)
plt.plot(p2)
plt.show()
```

Can you give me any insight into this?

Best regards,
Denis

10x10 Field.dcm (2 MB)

jkerns · September 24, 2017, 7:31pm

Denis,
For your example script, I get very reasonable results between the raw and filtered profiles. If I increase the filtering to around 0.5 or more I get ridiculous results for the overall profile, which will hopefully make sense after reading below.

The filter method takes either a float between 0 and 1 or an integer as its size. As explained here, a float between 0 and 1 is interpreted as a fraction of the entire profile (i.e. size=0.2 uses a filter window of 20% of the entire profile length). This is useful when you don’t know the size of the profile or want something consistent between differently sized profiles. If an integer is passed, that is the window in pixels (e.g. size=5 uses 5 pixels no matter the profile size). You can also choose a median or gaussian filter. In your script, then, p1 applies a median filter (the default) with a window of 1020 x 0.1 = 102 pixels. Each value is replaced by the median of the values within roughly 51 pixels on either side of it, and this is done for every value in the array. As you can imagine, a very large size (e.g. 0.8 or 500) makes the window wider than the field itself, so the smoothing wipes out the field and gives unreasonable results. Note also that the method is a fairly simple wrapper around scipy’s median and gaussian filters, so studying those may also clear things up.
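
To make the window-size point concrete, here is a rough sketch that calls scipy's median filter directly on a made-up profile. It is not pylinac's actual code: `window_from_size` is a hypothetical helper that only mimics the float-versus-integer rule described above, and the synthetic profile (1020 samples, a field about 300 pixels wide) is an assumption chosen so the numbers line up with the 102-pixel example.

```python
import numpy as np
from scipy.ndimage import median_filter


def window_from_size(size, n_samples):
    """Hypothetical helper (not part of pylinac): turn a size into a pixel window.

    A float between 0 and 1 is treated as a fraction of the profile length;
    an integer is treated as a pixel count.
    """
    if isinstance(size, float):
        if not 0 < size < 1:
            raise ValueError("float size must be between 0 and 1")
        return max(1, int(round(size * n_samples)))
    return int(size)


# Synthetic stand-in for a field profile: flat top with sigmoid penumbrae.
x = np.arange(1020)
left_edge = 1 / (1 + np.exp(-(x - 360) / 5))
right_edge = 1 / (1 + np.exp(-(x - 660) / 5))
raw = left_edge - right_edge  # roughly 1 inside the field, roughly 0 outside

# Modest window (10% of the profile, i.e. 102 pixels): narrower than the field.
small_window = window_from_size(0.1, raw.size)
lightly_filtered = median_filter(raw, size=small_window)

# Huge window (80% of the profile): much wider than the field.
large_window = window_from_size(0.8, raw.size)
heavily_filtered = median_filter(raw, size=large_window)

print("window sizes:", small_window, large_window)
print("peak after light filtering:", lightly_filtered.max())
print("peak after heavy filtering:", heavily_filtered.max())
```

With the modest window the plateau survives, because most samples in any window near the field centre are field samples. With the 80% window, more than half of every window lies outside the field, so the median is an out-of-field value everywhere and the field disappears from the filtered profile.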

Let me know how things turn out!