Conference Room SAD

Subject Re: Parallel module fault on a 64 bit 4-core machine
Date: 2011/05/26(Thu) 12:47:36
Contributor: Akio Morita

> Hello,
> I'm attempting to use the parallel module of the SAD main trunk on a 64-bit 4-core SLES10 machine. However, executing "test-scale.sad" resulted in a segmentation fault. The error output is below:
>
> (* Care to activate CPUs on power managed system *)
> Map[func, l];
>
> BenchScale[]; Exit[];
> Library[Algorism/Parallel/Fork] from /home/duanz/SAD/share/Extension/Algorism/Parallel/Fork.n is loaded.
> ???General::abort: Aborted:
> BenchScale[]
> ^
> ???-FFS-Error-?Undefined command or element: BENCHSCALE[]
>
> ! End of File
> In[1]:= ^C
>
>
> I tried it on another machine, a 32-bit 8-core SLC5.5 system, and it worked properly. So I wonder if the fault above is related to 64-bit?
>
The fork(2)-based parallel algorithm uses anonymous shared memory as its inter-process communication channel.
This shared memory is created with the mmap(2) system call, which returns a pointer.
In the SAD code, a pointer is handled as an index into a double array, and that index is stored in a signed integer.
Thus, the SAD code can handle only about 16 GiB of virtual memory space around rlist(1).
(A negative index is defined as invalid.)

If mmap(2) returns an address in a higher region of the VM space, the SAD shared memory code causes a fatal error
(e.g. a segmentation fault).
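
The limit described above can be illustrated with a little arithmetic. The following is only a sketch, not SAD's actual code: it assumes the index is a 32-bit signed integer, that each array element is an 8-byte double, and that indexing is 1-based from rlist(1). The function names and all addresses are hypothetical values chosen for illustration.

```python
DOUBLE_SIZE = 8          # bytes per double-precision element (assumption)
INT32_MAX = 2**31 - 1    # largest signed 32-bit integer (assumed index width)


def index_from_address(base_addr, addr):
    """Return the 1-based double-array index of addr relative to base_addr."""
    return (addr - base_addr) // DOUBLE_SIZE + 1


def is_valid_index(index):
    """Usable only if the index is positive and fits in a signed 32-bit integer."""
    return 1 <= index <= INT32_MAX


def addressable_bytes():
    """Size of the window that valid indices can reach above the base address."""
    return INT32_MAX * DOUBLE_SIZE


if __name__ == "__main__":
    base = 0x600000                 # hypothetical address of rlist(1)
    low_map = base + 2**30          # hypothetical mapping 1 GiB above the base
    high_map = 0x7F0000000000       # hypothetical mapping high in a 64-bit space

    print(addressable_bytes() / 2**30)                         # just under 16 (GiB)
    print(is_valid_index(index_from_address(base, low_map)))   # True
    print(is_valid_index(index_from_address(base, high_map)))  # False
```

Under these assumptions, a mapping placed within roughly 16 GiB above rlist(1) yields a usable index, while a mapping placed high in the 64-bit address space yields an index that cannot be stored, which matches the failure seen on the 64-bit machine.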

Solution (fix the SAD code)
* Rewrite the memory handling code and the internal data structures.

Workaround (tune or modify the operating system)
* Limit the VM space to 16 GiB
* Modify mmap(2) to return addresses below the 16 GiB boundary

