Row-wise vs. column-wise image processing

**peter koch** · Jan 25 '07, 12:15 PM

Re: Row-wise vs. column-wise image processing

On Jan 25, 11:59 am, Enrique Cruiz <jni6l03mdo6n...@jetable.org>
wrote:

> Hello all,
>
> I am currently implementing a fairly simple algorithm. It scans a
> grayscale image, and computes a pixel's new value as a function of its
> original value. Two passes are made, first horizontally and second
> vertically. The problem I have is that the vertical pass is 3 to 4
> times slower than the horizontal, although the code is _exactly_ the
> same in both cases?!

[snip]
Modern CPUs are complex beasts doing lots of stuff to improve
performance. One such thing is caching of memory, and that is what
happens here: the horizontal pass can exploit the cache much better
than the vertical pass.
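
To put a number on that, here is a minimal sketch that counts how many distinct cache lines a run of accesses touches. It assumes a 64-byte cache line (common, but not universal), one byte per pixel, and a buffer that starts on a cache-line boundary; `kCacheLine` and `linesTouched` are names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Assumed cache-line size in bytes; 64 is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Count the distinct cache lines touched by `count` accesses that start
// at byte offset 0 and advance by `strideBytes` on every step.
std::size_t linesTouched(std::size_t count, std::size_t strideBytes)
{
    assert(strideBytes > 0);
    std::set<std::size_t> lines;
    for (std::size_t i = 0; i < count; ++i)
        lines.insert(i * strideBytes / kCacheLine);
    return lines.size();
}
```

For an image 1024 pixels wide, `linesTouched(64, 1)` (64 steps along a row) gives 1, while `linesTouched(64, 1024)` (64 steps down a column) gives 64: every vertical step lands on a different cache line.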

/Peter

**Erik Wikström** · Jan 25 '07, 12:15 PM

Re: Row-wise vs. column-wise image processing

Enrique Cruiz wrote:

This is not a C++ question as such, so it's kind of off topic, but
anyway...

> // then for every column
> for(col=firstCol+1 ; col<=lastCol-1 ; ++col)
> {
>     *pixel = (*pixel) * 2 / 3;
>     pixel++;
> }

Normally one would use the loop variable inside the loop, so I suspect
that this is not the real code you are showing. In the real code there
might be something else that makes the difference, though I doubt it.

> My only guess relates to memory management issues. Since the image is
> stored row-wise, the current and next values are physically next in
> memory during the horizontal pass. On the other hand, for the vertical
> pass, the next value is stored in the next row, and the distance
> between them becomes 'image_width'. My guess is that the next pixel
> value in such a case is not close enough to be stored in the processor
> cache or register. The processor has to fetch it from memory, hence the
> massive loss in speed. This is however just a guess.

Probably yes. Many factors come into play: the size of the image, the
size of the caches of the CPU, how much speculation the CPU does and so
on. A modern CPU often speculates that you don't only want to use the
piece of memory you try to access but also those pieces close to it.
This helps in the row-wise run since it allows the CPU to read in the
memory before it is actually used; in the column-wise run it works
against you by pulling in more than you need. If you have a smaller
image you might be able to squeeze it all into the cache, which should
make both runs equally fast.
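
A rough way to check this on your own machine is to time the same per-pixel operation in both traversal orders and vary the image size. The sketch below assumes a row-major image with one byte per pixel; `Image`, `scaleRowWise`, `scaleColumnWise` and `timeMs` are names invented for this example, not taken from the original code.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major, one byte per pixel

// Visit pixels row by row: consecutive accesses are adjacent in memory.
void scaleRowWise(Image& img, std::size_t width, std::size_t height)
{
    for (std::size_t row = 0; row < height; ++row)
        for (std::size_t col = 0; col < width; ++col) {
            std::uint8_t& p = img[row * width + col];
            p = static_cast<std::uint8_t>(p * 2 / 3);
        }
}

// Visit pixels column by column: consecutive accesses are `width` bytes apart.
void scaleColumnWise(Image& img, std::size_t width, std::size_t height)
{
    for (std::size_t col = 0; col < width; ++col)
        for (std::size_t row = 0; row < height; ++row) {
            std::uint8_t& p = img[row * width + col];
            p = static_cast<std::uint8_t>(p * 2 / 3);
        }
}

// Wall-clock time of one call to `pass`, in milliseconds.
template <typename Pass>
double timeMs(Pass pass, Image& img, std::size_t width, std::size_t height)
{
    const auto start = std::chrono::steady_clock::now();
    pass(img, width, height);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `timeMs(scaleRowWise, img, w, h)` and `timeMs(scaleColumnWise, img, w, h)` on copies of the same image, once with a small image and once with one much larger than the cache, should show whether the gap depends on size. The numbers will vary with compiler, optimisation flags and hardware, so treat them as indicative only.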

--
Erik Wikström

**Enrique Cruiz** · Jan 25 '07, 02:55 PM

Re: Row-wise vs. column-wise image processing

On 2007-01-25 12:02:42 +0000, "peter koch" <peter.koch.larsen@gmail.com> said:

> Modern CPUs are complex beasts doing lots of stuff to improve
> performance. One such thing is caching of memory, and that is what
> happens here: the horizontal pass can exploit the cache much better
> than the vertical pass.

Thanks. Do you know any resources that discuss similar issues related
to (aggressive) optimization?

Alexis

**Enrique Cruiz** · Jan 25 '07, 02:55 PM

Re: Row-wise vs. column-wise image processing

On 2007-01-25 12:05:36 +0000, "Erik Wikström"
<eriwik@student.chalmers.se> said:

> Probably yes. Many factors come into play: the size of the image, the
> size of the caches of the CPU, how much speculation the CPU does and so
> on. A modern CPU often speculates that you don't only want to use the
> piece of memory you try to access but also those pieces close to it.
> This helps in the row-wise run since it allows the CPU to read in the
> memory before it is actually used; in the column-wise run it works
> against you by pulling in more than you need. If you have a smaller
> image you might be able to squeeze it all into the cache, which should
> make both runs equally fast.

Thanks a lot for the informative answer. Do you know good resources
that discuss similar issues related to optimisation?

Enrique

**Jim Langston** · Jan 25 '07, 03:25 PM

Re: Row-wise vs. column-wise image processing

"Enrique Cruiz" <jni6l03mdo6nvuu@jetable.org> wrote in message
news:2007012514482643658-jni6l03mdo6nvuu@jetableorg...

> On 2007-01-25 12:05:36 +0000, "Erik Wikström" <eriwik@student.chalmers.se>
> said:
>
>> Probably yes. Many factors come into play: the size of the image, the
>> size of the caches of the CPU, how much speculation the CPU does and so
>> on. A modern CPU often speculates that you don't only want to use the
>> piece of memory you try to access but also those pieces close to it.
>> This helps in the row-wise run since it allows the CPU to read in the
>> memory before it is actually used; in the column-wise run it works
>> against you by pulling in more than you need. If you have a smaller
>> image you might be able to squeeze it all into the cache, which should
>> make both runs equally fast.
>
> Thanks a lot for the informative answer. Do you know good resources that
> discuss similar issues related to optimisation?
>
> Enrique

Also, consider that
pixel++;
is less assembly than:
pixel += imgWidth;

I mean, pixel++; may be something as simple as:
INC [pixel]
(although it may be more like)
MOV AX,[pixel]
INC AX
MOV [pixel],AX

Where pixel += imgWidth; would involve an extra memory load and an add.
Some I've seen are like:
MOV AX,[pixel]
MOV BX,[imgWidth]
ADD AX,BX
MOV [pixel],AX

or similar. Without looking at the assembly I couldn't say. I learned
assembly back in the 80x86 days and I'm sure it's changed a bit since then.
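
One way to separate the cost of the increment from the cost of the access pattern is to write the vertical pass both ways and compare. The sketch below is only an illustration: it assumes a hypothetical vertical filter (each pixel becomes the average of itself and the already-filtered pixel above it), not the original poster's actual computation, and the function names are made up. Both versions produce the same image; the first steps with `pixel += imgWidth`, the second walks memory in order with `++pixel`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Column by column: the pointer moves down one row with pixel += imgWidth.
void verticalByColumns(std::vector<std::uint8_t>& img,
                       std::size_t imgWidth, std::size_t imgHeight)
{
    if (imgHeight < 2) return;  // nothing above row 0 to average with
    for (std::size_t col = 0; col < imgWidth; ++col) {
        std::uint8_t* pixel = img.data() + col;  // row 0 of this column
        for (std::size_t row = 1; row < imgHeight; ++row) {
            pixel += imgWidth;  // step down to the next row
            *pixel = static_cast<std::uint8_t>((*pixel + *(pixel - imgWidth)) / 2);
        }
    }
}

// Row by row: same vertical dependency, but memory is walked with ++pixel.
void verticalByRows(std::vector<std::uint8_t>& img,
                    std::size_t imgWidth, std::size_t imgHeight)
{
    if (imgHeight < 2) return;
    std::uint8_t* pixel = img.data() + imgWidth;  // first pixel of row 1
    std::uint8_t* const end = img.data() + imgWidth * imgHeight;
    for (; pixel != end; ++pixel)
        *pixel = static_cast<std::uint8_t>((*pixel + *(pixel - imgWidth)) / 2);
}
```

The row-by-row version works here because each pixel only needs the row above, which has already been finished by the time it is read. Whether that reordering is possible for the real algorithm depends on what the vertical pass actually computes.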
