StrongARM-based Intrinsyc Cerfboard and Compact Flash craziness
john.orlando@gmail.com 01-11-2005, 04:04 AM
Hello,
We have been working on a project that utilizes a StrongARM-based
Cerfboard from Intrinsyc (SA1110 version of the StrongARM). For those
of you not familiar with it, the Cerfboard is a small embedded module
that runs Linux (specifically, a distro called I-Linux that Intrinsyc
provides). This board has Compact Flash, 10/100 ethernet, and a few
other goodies. Anyway...
Recently, we have started using the Compact Flash slot that is
available on the board. We noticed that at times, the board would
completely freeze when attempting to go through the cardctl
insert/mount procedure to gain access to the Compact Flash. We wrote a
script that would do the following:
Forever:
    cardctl insert
    mount /mnt/cf
    cp file to CF
    rm file from CF
    umount /mnt/cf
    cardctl eject
    print iteration count
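A minimal sketch of that loop as a POSIX shell script. The variable names FILE, MNT, and MAX_ITER and the two function names are made up for illustration, and the sketch assumes /mnt/cf has an /etc/fstab entry so that a bare "mount /mnt/cf" works, as in the pseudocode.

```shell
#!/bin/sh
# Sketch of the Compact Flash stress loop.
# FILE, MNT and MAX_ITER are illustrative names, not from the original script.
FILE=${FILE:-/tmp/testfile}   # file copied to the card on every pass
MNT=${MNT:-/mnt/cf}           # mount point, assumed to be listed in /etc/fstab
MAX_ITER=${MAX_ITER:-0}       # 0 means loop forever

# One insert/mount/copy/remove/unmount/eject pass; stops at the first error.
cf_cycle() {
    cardctl insert &&
    mount "$MNT" &&
    cp "$FILE" "$MNT/" &&
    rm "$MNT/$(basename "$FILE")" &&
    umount "$MNT" &&
    cardctl eject
}

# Repeat cf_cycle, printing the iteration count after each successful pass.
run_cf_test() {
    i=0
    while [ "$MAX_ITER" -eq 0 ] || [ "$i" -lt "$MAX_ITER" ]; do
        cf_cycle || { echo "failed at iteration $i" >&2; return 1; }
        i=$((i + 1))
        echo "iteration $i"
    done
}
```

Calling run_cf_test at the end of the script starts the loop. With MAX_ITER=0 it runs until a command fails or the board hangs, in which case the last iteration count printed on the serial console shows how far it got.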
We then let this run forever, watching the output on our serial
console. Sometimes, the board will crash after anything between 0 and
40 iterations (random). There is no response from any i/o device
(serial console, ethernet, etc). However, upon scoping the SDRAM
lines, there is still activity here, albeit much less than before the
crash occurs (by about an order of magnitude).
Sometimes, the board runs forever just fine...
....HERE IS WHERE THE CRAZINESS PART STARTS
One of our technicians noticed that if the script is started in the
FIRST nine minutes after power-up, the script will fail anywhere
between 0 and 40 iterations. If we start the script AFTER nine minutes
(it's not PRECISELY nine minutes, but close), it NEVER EVER FAILS! We
have had it run thousands of times without any problems, as long as we
wait nine minutes before starting.
Here are the specifics of our setup:
Kernel: 2.4.9-ac10-rmk2-np1-cerf2
Card Services: 3.1.22
CPU: SA1110 rev 8
Distro: I-Linux 4.4
Execution speed: 192 MHz (through PLL)
There are no applications running on this board, other than the script.
From a hardware standpoint, we do notice that after the nine-minute
mark the address/data bus in general appears much less active. Prior
to nine minutes, there are address/data accesses on the SA1110 buses
about once per 1.5 µs or so, in burst chunks of about 14 at a time,
with about 6 µs between burst chunks. After the nine-minute mark, the
accesses come about once every 14 µs, still in groups of 14 at a time,
with the time between burst chunks also increased linearly.
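To put rough numbers on that slowdown, here is a back-of-the-envelope calculation in shell (using awk for the decimal arithmetic). It assumes a burst period is roughly the number of accesses times the access spacing plus the gap, and that "increased linearly" means the gap grows by the same 14/1.5 factor, giving about 56 µs. Both are assumptions; the after-gap was not measured in the post, and the function name is made up.

```shell
#!/bin/sh
# burst_period_us ACCESS_US PER_BURST GAP_US
# Prints the approximate time from the start of one burst to the next.
burst_period_us() {
    awk -v a="$1" -v n="$2" -v g="$3" 'BEGIN { printf "%.1f\n", a * n + g }'
}

# Before the nine-minute mark: 14 accesses, 1.5 us apart, 6 us gap.
before=$(burst_period_us 1.5 14 6)

# After: 14 accesses, 14 us apart. The 56 us gap is an assumption
# (6 us scaled by 14 / 1.5), not a measured value.
after=$(burst_period_us 14 14 56)

awk -v b="$before" -v a="$after" \
    'BEGIN { printf "before %.1f us, after %.1f us, ratio %.1f\n", b, a, a / b }'
# prints: before 27.0 us, after 252.0 us, ratio 9.3
```

Under those assumptions the bus is roughly nine times less busy after the nine-minute mark.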
Has anyone ever seen anything like this? Any ideas what magic happens
at the nine minute mark? We have been eliminating hardware slowly, and
are starting to think it is kernel/driver related. As a test, we
grabbed an XScale-based Cerfboard running 2.4.18 kernel, and same Card
Services. We ran the script on this board and it doesn't exhibit the
problem (though the hardware is obviously also different).
If you've read this far, much appreciated. We've been pulling our hair
out regarding this one. I don't think Intrinsyc still supports this
board, but we're following up with them just in case.
Thanks in advance for any help that can be provided.
Balding,
John O.
Marco Cavallini [KOAN] 01-11-2005, 09:26 PM
We worked on a CerfBoard with SA1110 ;-)
You have to switch to a more recent 2.4 kernel;
I suggest at least linux-2.4.27.
Best regards
--
Marco Cavallini
=====================================================
Koan s.a.s. - Software Engineering
Linux and WinCE solutions for Embedded and Real-Time Software
Klinux : the embedded distribution for industrial applications
- Atmel AT91 ARM Third Party Consultant
- Intel PCA Developer Network Member
- Microsoft Windows Embedded Partner
Via Pascoli, 3 - 24121 Bergamo - ITALIA
Tel. +39-(0)35-255.235 - Fax +39-178-223.9748
http://www.koansoftware.com - http://www.klinux.org
=====================================================
john.orlando@gmail.com 02-11-2005, 03:16 AM
> We worked on a CerfBoard with SA1110 ;-)
> You have to switch to a more recent 2.4 kernel,
> I suggest you at least linux-2.4.27.
Hi Marco,
Thanks for the info...the 2.4.9 kernel was the last one that was
"officially" provided by Intrinsyc, which is why we were trying to
stick with it. We did get a 2.4.18-based kernel up and running on our
SA1110 board yesterday after tweaking a few things. However, I don't
believe that this fixed our issues. We can try to get to the 2.4.27
kernel...out of curiosity, how did you pick this kernel, other than
the fact that it is near the end of the 2.4.x series? Is there a
specific fix in 2.4.27 that you are aware of that fixed an issue you
had?
Also, out of curiosity, since Intrinsyc apparently doesn't support
anything "officially" past 2.4.18 (and even that is only for
XScale-based boards), how did you go about building and patching your
kernel to get it to work? Did you just go to www.arm.linux.org.uk and
grab the appropriate kernel and patch it against Intrinsyc's updates?
Any help is appreciated here...thanks!
John O.
Marco Cavallini [KOAN] 02-11-2005, 04:51 AM
>> We worked on a CerfBoard with SA1110 ;-)
>> You have to switch to a more recent 2.4 kernel,
>> I suggest you at least linux-2.4.27.
>
> Hi Marco,
> Thanks for the info...the 2.4.9 kernel was the last one that was
> "officially" provided by Intrinsyc, which is why we were trying to
> stick with it. We did get a 2.4.18-based kernel up and running on our
> SA1110 board yesterday after tweaking a few things. However, I don't
> believe that this fixed our issues.
Which rmk patch are you using?
> We can try to get to the 2.4.27
> kernel...out of curiosity, how did you pick this kernel, other than
> the fact that it is near the end of the 2.4.x series? Is there a
> specific fix in 2.4.27 that you are aware of that fixed an issue you
> had?
I used the 2.4.18 version years ago,
and it fixed some issues;
you can see the release notes here:
http://www.arm.linux.org.uk/developer/v2.4/
I sent a patch in linux-2.4.27-vrs1,
but it is for the Cypress USB host.
If I remember correctly, I solved the CF problems by upgrading Card Services.
> Also, out of curiosity, since Intrinsyc apparently doesn't support
> anything "officially" past 2.4.18 (and even that is only for
> XScale-based boards), how did you go about building and patching your
> kernel to get it to work? Did you just go to www.arm.linux.org.uk and
> grab the appropriate kernel and patch it against Intrinsyc's updates?
Get vanilla kernel-2.4.27 and the patch linux-2.4.27-vrs1 from
www.arm.linux.org.uk
BTW
The most important thing is to switch to the latest Card Services: 3.2.8
HTH
--
Marco Cavallini
john.orlando@gmail.com 04-11-2005, 07:13 AM
> Which rmk patch are you using ?
>
Kernel: 2.4.9-ac10-rmk2-np1-cerf2
> Get vanilla kernel-2.4.27 and the patch linux-2.4.27-vrs1 from
> www.arm.linux.org.uk
We did this...so now we have a 2.4.27 kernel with the vrs1 patch in it.
We don't have the ethernet fully up and running yet, but the kernel
boots and brings us to a login. Alas, the problem still exists.
Problematic in the first nine minutes, perfect after that.
>
> BTW
> The most important thing is to switch to latest Card Services: 3.2.8
We upgraded our card services to 3.2.8, and this didn't seem to make
any difference.
So...we are still scratching our (almost bald) heads at this point.
Any other ideas out there regarding what could possibly be going on
here?
Thanks,
John
john.orlando@gmail.com 01-12-2005, 04:31 PM
Well...it looks like we solved our problem. If you haven't followed
the thread up till now, I'd suggest re-reading it from the beginning.
I'm just recapping the solution we found.
We kept trying to figure out what could be causing a change in the
system at the 9-minute mark, and why we could use the Compact Flash in
our StrongARM-based system without problems after the 9-minute mark,
but with tons of problems before it. We decided that perhaps we could
get more info about the current state of the system if we took a
snapshot of all the registers once a minute, to see if anything
changes around 9 minutes.
There is a module called registers.o that can be insmod'ed, which
makes all the registers in the SA1110 appear as regular files. So, we
wrote a script that, once a minute, dumps the name of each register
and its contents to a file. We let the script run for 30 minutes, and
then examined the output log.
Sure enough, at the 9-minute mark, a few registers went through a
one-time modification, and a few registers that had been changing
every minute up till the 9-minute mark stopped changing. We started
digging through the SA1110 data sheet to see which registers these
were. It turned out the majority of them had to do with a DMA channel
that is allocated to support the framebuffer for an external LCD
display. We aren't using any LCD display in our system, nor do we need
the framebuffer. This support was turned on in the stock config file
provided by Intrinsyc for the CerfBoard.
So, we rebuilt the kernel without framebuffer support, and sure
enough, our Compact Flash test didn't fail regardless of whether or
not we started it in the first 9 minutes. We re-ran our
log-the-registers-to-files script as well, and the registers that had
previously been changing at the 9-minute mark were no longer changing.
So, we are guessing that there is something screwy with the
framebuffer/LCD driver that was somehow causing a conflict with our
Compact Flash card. The 9-minute thing is still a mystery to me
though. But as I said, we don't need framebuffer/LCD support, so out
of the kernel it goes.
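For anyone who wants to try the same approach, the once-a-minute register dump could look roughly like the shell sketch below. The directory in REG_DIR is an assumption about where registers.o exposes its files and may differ on your kernel. LOG_DIR and the function names are made up for illustration, and the sketch assumes each register file holds a single line of text.

```shell
#!/bin/sh
# Sketch: snapshot every register file once a minute, then compare snapshots.
# REG_DIR is an assumed location for the files created by registers.o.
REG_DIR=${REG_DIR:-/proc/cpu/registers}
LOG_DIR=${LOG_DIR:-/tmp/reglog}

# snapshot N: write one "name value" line per register to $LOG_DIR/snap.N
snapshot() {
    out="$LOG_DIR/snap.$1"
    : > "$out"
    for f in "$REG_DIR"/*; do
        [ -f "$f" ] || continue
        echo "$(basename "$f") $(cat "$f")" >> "$out"
    done
}

# changed A B: print the names of registers that differ between snap.A and snap.B
changed() {
    awk 'NR == FNR { old[$1] = $0; next } old[$1] != $0 { print $1 }' \
        "$LOG_DIR/snap.$1" "$LOG_DIR/snap.$2"
}

# log_registers MINUTES: take one snapshot per minute for MINUTES minutes
log_registers() {
    mkdir -p "$LOG_DIR"
    n=0
    while [ "$n" -lt "$1" ]; do
        snapshot "$n"
        n=$((n + 1))
        sleep 60
    done
}
```

After "log_registers 30", running "changed 8 9" lists the registers whose contents differ between the minute-8 and minute-9 snapshots. Comparing each consecutive pair in turn also shows which registers change every minute early on and then stop.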
We will now start the long, slow process of regrowing our hair :-)
Thanks for all who offered an idea...as always, much appreciated.
John O.