Monday, March 15, 2010

Installing GT4 on a Single Linux Machine


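There is an official quickstart guide: http://www.globus.org/toolkit/docs/4.2/4.2.1/admin/quickstart/
This post mostly follows the English documentation and covers the authentication steps in detail, so the two can be read side by side.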

IA32 + Red Hat Linux AS4
ip:166.111.*.*/10.0.1.191
hostname=gridserver
Three ordinary users: fitgrid, globus, postgre

I. Make sure the following software is available on the machine; similar versions are fine:
gt4.0.5-x86_rhas_4-installer.tar.tar
jdk-1_5_0_08-linux-i586.bin
apache-ant-1.7.0-bin.zip
postgresql-8.2.5.tar.gz

Installing jdk1.5.0_08
1. Download jdk-1_5_0_08-linux-i586.bin and make it executable:
[root@gridserver java]# chmod 755 jdk-1_5_0_08-linux-i586.bin
[root@gridserver java]# ./jdk-1_5_0_08-linux-i586.bin
When prompted, type y

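2. Set the environment variables
To make this JDK available to everyone, edit /etc/profile.
To make it available to a single ordinary user only, edit $HOME/.bash_profile instead. Here we edit /etc/profile.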

[root@gridserver local]# vi /etc/profile   (append at the end:)
JAVA_HOME=/usr/java/jdk1.5.0_08
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME CLASSPATH PATH

3. Test
[root@gridserver local]# java -version
jdk1.4.2
This is the java that ships with AS4. Find out where it lives, then back it up:
[root@gridserver local]# which java
/usr/bin/java
[root@gridserver bin]# cd /usr/bin/ && mv java javabackup
[root@gridserver bin]# java -version
java version "1.5.0_08"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03)
Java HotSpot(TM) Client VM (build 1.5.0_08-b03, mixed mode)
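Which `java` answers `java -version` depends on which directory comes first on PATH. One plausible reading of the stale 1.4.2 result above is that the shell was still resolving the system copy. The sketch below illustrates that precedence rule only; the temp directory and the two stand-in `java` scripts are hypothetical and exist purely for the demonstration.

```shell
# Build two throwaway directories, each holding a fake "java" script.
demo=$(mktemp -d)
mkdir -p "$demo/system/bin" "$demo/jdk/bin"
printf '#!/bin/sh\necho "java 1.4.2 (system)"\n' > "$demo/system/bin/java"
printf '#!/bin/sh\necho "java 1.5.0_08 (new JDK)"\n' > "$demo/jdk/bin/java"
chmod 755 "$demo/system/bin/java" "$demo/jdk/bin/java"

# Print which java a given PATH value resolves to, then run it.
# The subshell keeps the PATH change from leaking into the caller.
resolve_java() {
    ( PATH="$1"; hash -r 2>/dev/null; command -v java; java )
}

# System directory first: the old java wins.
old_first=$(resolve_java "$demo/system/bin:$demo/jdk/bin:/usr/bin:/bin")
# JDK directory first (the effect of PATH=$JAVA_HOME/bin:$PATH): the new java wins.
new_first=$(resolve_java "$demo/jdk/bin:$demo/system/bin:/usr/bin:/bin")

echo "$old_first"
echo "$new_first"
```

On the real machine, `command -v java` after `source /etc/profile` shows the same information without moving any files.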

That is indeed the freshly installed JDK. Test a small program:
[root@gridserver local]# cat hello.java
public class hello {
public static void main(String args [ ])
{
System.out.println("Hello World!");
}
}

[root@gridserver local]# javac hello.java
[root@gridserver local]# java hello
Hello World!
Installation successful!

Installing Apache-Ant-1.7.0
Put apache-ant-1.7.0-bin.zip in /usr/local/
[root@gridserver local]# unzip apache-ant-1.7.0-bin.zip
[root@gridserver local]# mv apache-ant-1.7.0 ant-1.7.0
[root@gridserver local]# vi /etc/profile
ANT_HOME=/usr/local/ant-1.7.0
PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$PATH
[root@gridserver local]# source /etc/profile
[root@gridserver local]# ant -version
Apache Ant version 1.7.0 compiled on December 13 2006
Configuration successful!

II. Installing GT

1. Add an ordinary user named globus. GT must be installed as the globus user, not as root.
[root@gridserver local]# useradd globus
[root@gridserver local]# passwd globus

2. Here GT is installed under /usr/local/globus-4.0.5/
[root@gridserver local]# mkdir globus-4.0.5
[root@gridserver local]# chown globus:globus /usr/local/globus-4.0.5

3. Update the environment variables
[root@gridserver globus]# vi /etc/profile   (append at the end:)
GLOBUS_LOCATION=/usr/local/globus-4.0.5
export JAVA_HOME ANT_HOME GLOBUS_LOCATION CLASSPATH PATH
[root@gridserver globus]# source /etc/profile
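Since every later step relies on JAVA_HOME, ANT_HOME and GLOBUS_LOCATION, it can help to confirm that each one is set and points at a real directory. The following is a minimal sketch; the function name `check_env_dirs` and the DEMO_* variables are made up for illustration and are not part of GT4.

```shell
# Print one status line per variable name passed in.
# Returns non-zero if any variable is unset or is not a directory.
check_env_dirs() {
    status=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name: not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name: $value is not a directory"
            status=1
        else
            echo "$name: ok ($value)"
        fi
    done
    return $status
}

# Demo with stand-in directories so the sketch runs anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/jdk" "$demo/ant"
DEMO_JAVA_HOME="$demo/jdk"
DEMO_ANT_HOME="$demo/ant"
DEMO_GLOBUS_LOCATION="$demo/globus-missing"   # deliberately absent
report=$(check_env_dirs DEMO_JAVA_HOME DEMO_ANT_HOME DEMO_GLOBUS_LOCATION) || true
echo "$report"
```

On the real host the call would be `check_env_dirs JAVA_HOME ANT_HOME GLOBUS_LOCATION`, run after `source /etc/profile`.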

4. Put gt4.0.5-x86_rhas_4-installer.tar.tar in /home/globus/
[root@gridserver globus]# chown globus:globus gt4.0.5-x86_rhas_4-installer.tar.tar
[root@gridserver globus]# su - globus
[globus@gridserver ~]$ tar zxvf gt4.0.5-x86_rhas_4-installer.tar.tar
[globus@gridserver ~]$ cd gt4.0.5-x86_rhas_4-installer
[globus@gridserver ~]$ ./configure --prefix=/usr/local/globus-4.0.5

Note: if you plan to submit jobs through PBS later and PBS is already installed, add --enable-wsgram-pbs:
$ ./configure --prefix=/usr/local/globus-4.0.5 --enable-wsgram-pbs
Likewise, you can add --enable-wsgram-condor and --enable-wsgram-lsf.

[globus@gridserver ~]$ make && make install

5.Creating a CA (Certificate Authority)
[globus@gridserver ~]$ source $GLOBUS_LOCATION/etc/globus-user-env.sh
The command puts all the Globus libraries into my CLASSPATH

[globus@gridserver ~]$ $GLOBUS_LOCATION/setup/globus/setup-simple-ca
WARNING: GPT_LOCATION not set, assuming:
GPT_LOCATION=/usr/local/globus-4.0.5
C e r t i f i c a t e A u t h o r i t y S e t u p
This script will setup a Certificate Authority for signing Globus users certificates.
It will also generate a simple CA package that can be distributed to the users of the CA.
The CA information about the certificates it distributes will
be kept in:
/home/globus/.globus/simpleCA/
The unique subject name for this CA is:
cn=Globus Simple CA, ou=simpleCA-gridserver, ou=GlobusTest, o=Grid
Do you want to keep this as the CA subject (y/n) [y]:y
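Here, cn stands for "common name", the name of the certificate;
ou stands for "organizational unit", which distinguishes this SimpleCA server from other servers; the second ou carries the machine name; o stands for
"organization", the organization this grid belongs to. Enter y to accept the defaults, or n to set these values yourself.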
Enter the email of the CA (this is the email where certificate requests will be sent
to be signed by the CA):gridinstall@163.com
The CA certificate has an expiration date. Keep in mind that once the CA certificate has expired,
all the certificates signed by that CA become invalid. A CA should regenerate the CA certificate
and start re-issuing ca-setup packages before the actual CA certificate expires. This can be done
by re-running this setup script. Enter the number of DAYS the CA certificate should last before
it expires.[default: 5 years (1825 days)]:
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
Remember this pass phrase; it is needed again later.
creating CA config package...done.
A self-signed certificate has been generated for the Certificate Authority with the subject:
/O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=Globus Simple CA
If this is invalid, rerun this script
/usr/local/globus-4.0.5/setup/globus/setup-simple-ca
and enter the appropriate fields.
-------------------------------------------------------------------
The private key of the CA is stored in /home/globus/.globus/simpleCA//private/cakey.pem
The public CA certificate is stored in /home/globus/.globus/simpleCA//cacert.pem
The distribution package built for this CA is stored in
/home/globus/.globus/simpleCA//globus_simple_ca_379290e6_setup-0.19.tar.gz
This file must be distributed to any host wishing to request
certificates from this CA.
CA setup complete.
The following commands will now be run to setup the security
configuration files for this CA:
$GLOBUS_LOCATION/sbin/gpt-build /home/globus/.globus/simpleCA//globus_simple_ca_379290e6_setup-0.19.tar.gz
$GLOBUS_LOCATION/sbin/gpt-postinstall
-------------------------------------------------------------------
setup-ssl-utils: Configuring ssl-utils package
Running setup-ssl-utils-sh-scripts...
***************************************************************************
Note: To complete setup of the GSI software you need to run the
following script as root to configure your security configuration
directory:
/usr/local/globus-4.0.5/setup/globus_simple_ca_379290e6_setup/setup-gsi
For further information on using the setup-gsi script, use the -help
option. The -default option sets this security configuration to be
the default, and -nonroot can be used on systems where root access is
not available.
***************************************************************************
setup-ssl-utils: Complete
The simpleCA has now been generated:

[globus@gridserver ~]$ ls ~/.globus/
simpleCA

[globus@gridserver ~]$ ls ~/.globus/simpleCA/
cacert.pem crl grid-ca-ssl.conf newcerts serial
certs globus_simple_ca_379290e6_setup-0.19.tar.gz index.txt private
That's the directory where my simpleCA has been created.

[globus@gridserver ~]$ exit
[root@gridserver local]# source $GLOBUS_LOCATION/etc/globus-user-env.sh
Now run it with the '-default' flag so that the CA we just created becomes the default
certificate authority for certificates created on this node:

[root@gridserver ~]# $GLOBUS_LOCATION/setup/globus_simple_ca_379290e6_setup/setup-gsi -default
setup-gsi: Configuring GSI security
Making /etc/grid-security...
mkdir /etc/grid-security
Making trusted certs directory: /etc/grid-security/certificates/
mkdir /etc/grid-security/certificates/
Installing /etc/grid-security/certificates//grid-security.conf.379290e6...
Running grid-security-config...
Installing Globus CA certificate into trusted CA certificate directory...
Installing Globus CA signing policy into trusted CA certificate directory...
setup-gsi: Complete

Take a look at the simpleCA configuration files:
[root@gridserver local]# ls /etc/grid-security
certificates globus-host-ssl.conf globus-user-ssl.conf grid-security.conf

[root@gridserver ~]# ls /etc/grid-security/certificates
379290e6.0 globus-host-ssl.conf.379290e6 grid-security.conf.379290e6
379290e6.signing_policy globus-user-ssl.conf.379290e6
Those are the configuration files that establish trust for the simpleCA for my Globus Toolkit
installation. Notice that the hash value 379290e6 matches the hash value of my SimpleCA.
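The trusted-certificates directory pairs each CA certificate, named `<hash>.0`, with a `<hash>.signing_policy` file, as the listing above shows. The sketch below checks that pairing by file name only; it does not inspect certificate contents. The function name `check_ca_dir` and the `deadbeef` entry are hypothetical, and the demo runs against a temp directory rather than /etc/grid-security/certificates.

```shell
# For every "<hash>.0" certificate in a directory, report whether the
# matching "<hash>.signing_policy" file is present alongside it.
check_ca_dir() {
    dir=$1
    for cert in "$dir"/*.0; do
        [ -e "$cert" ] || continue          # directory holds no *.0 files
        hash=$(basename "$cert" .0)
        if [ -f "$dir/$hash.signing_policy" ]; then
            echo "$hash: certificate and signing policy present"
        else
            echo "$hash: signing policy missing"
        fi
    done
}

# Demo directory mirroring the listing above, plus one incomplete CA.
demo=$(mktemp -d)
touch "$demo/379290e6.0" "$demo/379290e6.signing_policy" \
      "$demo/globus-host-ssl.conf.379290e6" \
      "$demo/globus-user-ssl.conf.379290e6" \
      "$demo/grid-security.conf.379290e6"
touch "$demo/deadbeef.0"                    # hypothetical CA without a policy file
ca_report=$(check_ca_dir "$demo")
echo "$ca_report"
```

On the real host the call would be `check_ca_dir /etc/grid-security/certificates`.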

6. Next we want to request a host certificate for this node. After making the request we
will sign the certificate using the CA on this node. After the request is signed we will
install the certificate in the proper place on this node.

[root@gridserver globus]# grid-cert-request -host 'gridserver'
This command creates the following files:
/etc/grid-security/hostkey.pem
/etc/grid-security/hostcert_request.pem
/etc/grid-security/hostcert.pem (an empty file)
Next, create the signed host certificate and copy it back to /etc/grid-security/, overwriting the empty hostcert.pem.
On this node, we sign the certificate using our simpleCA as globus:

[root@gridserver globus]# su - globus
[globus@gridserver ~]$ grid-ca-sign -in /etc/grid-security/hostcert_request.pem -out hostsigned.pem
To sign the request
please enter the password for the CA key:

Enter the CA pass phrase chosen earlier;
the signed certificate hostsigned.pem is written to the current directory.
The new signed certificate is at: /home/globus/.globus/simpleCA//newcerts/01.pem

[globus@gridserver ~]$ exit
[root@gridserver ~]# cp ~globus/hostsigned.pem /etc/grid-security/hostcert.pem
The host certificate just created is owned by root and will be used by services such as
'globus-gridftp-server'. Most often the other services and the container they run in are not run as root.
They are run as user 'globus'. Still, these services usually run with a host type certificate.
In the end, we need one host certificate/key owned by root, and one host certificate/key owned by globus.
We do that by copying the files:

[root@gridserver grid-security]# cp hostcert.pem containercert.pem
[root@gridserver grid-security]# cp hostkey.pem containerkey.pem
[root@gridserver grid-security]# chown globus:globus container*.pem
[root@gridserver grid-security]# ls -l *.pem
-rw-r--r-- 1 globus globus 2654 Sep 24 16:43 containercert.pem
-r-------- 1 globus globus 887 Sep 24 16:44 containerkey.pem
-rw-r--r-- 1 root root 2654 Sep 24 16:42 hostcert.pem
-rw-r--r-- 1 root root 1379 Sep 24 16:40 hostcert_request.pem
-r-------- 1 root root 887 Sep 24 16:40 hostkey.pem
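In the listing above the key files are mode 400 while the certificates are world-readable. A quick way to confirm that a key grants nothing to group or others is to look at the mode string from `ls -ld`. This is a sketch under that assumption; `key_is_private` is a made-up helper name, and the demo uses temp files rather than the real keys.

```shell
# Succeed only if the file grants no permissions to group or others.
# In the mode string from `ls -ld` (e.g. "-r--------"), characters 5-10
# are the group and other permission bits.
key_is_private() {
    bits=$(ls -ld "$1" | cut -c5-10)
    [ "$bits" = "------" ]
}

demo=$(mktemp -d)
touch "$demo/containerkey.pem" "$demo/containercert.pem"
chmod 400 "$demo/containerkey.pem"     # like the key files in the listing
chmod 644 "$demo/containercert.pem"    # certificates may be world-readable

for f in "$demo/containerkey.pem" "$demo/containercert.pem"; do
    if key_is_private "$f"; then
        echo "$(basename "$f"): private"
    else
        echo "$(basename "$f"): readable by group or others"
    fi
done
```

On the real host the checks of interest would be `key_is_private /etc/grid-security/hostkey.pem` and the same for containerkey.pem.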

7.Now we'll get a usercert for fitgrid
[root@gridserver grid-security]# su - fitgrid
[fitgrid@gridserver ~]$ grid-cert-request
After prompting for a pass phrase, this creates the following files:
/home/fitgrid/.globus/usercert.pem (empty file)
/home/fitgrid/.globus/userkey.pem
/home/fitgrid/.globus/usercert_request.pem
Enter your name, e.g., John Smith:fitgrid
A certificate request and private key is being created.
You will be asked to enter a PEM pass phrase.
This pass phrase is akin to your account password,
and is used to protect your key file.
If you forget your pass phrase, you will need to
obtain a new certificate.
Generating a 1024 bit RSA private key
............++++++
...............................++++++
writing new private key to '/home/fitgrid/.globus/userkey.pem'
Enter PEM pass phrase:hello
Verifying - Enter PEM pass phrase:hello
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Level 0 Organization [Grid]:Level 0 Organizational Unit [GlobusTest]:Level 1 Organizational Unit
[simpleCA-gridserver]:Level 2 Organizational Unit []:Name (e.g., John M. Smith) []:
A private key and a certificate request has been generated with the subject:
/O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid
If the CN=fitgrid is not appropriate, rerun this script with the -force -cn "Common Name" options.
Your private key is stored in /home/fitgrid/.globus/userkey.pem
Your request is stored in /home/fitgrid/.globus/usercert_request.pem
Please e-mail the request to the Globus Simple CA
You may use a command similar to the following:
cat /home/fitgrid/.globus/usercert_request.pem mail
Only use the above if this machine can send AND receive e-mail. if not, please mail using some
other method. Your certificate will be mailed to you within two working days.
If you receive no response, contact Globus Simple CA at wzhfreedom@163.com
Now sign it as user globus. This machine has no e-mail set up, so do it by hand:

[fitgrid@gridserver etc]$ exit
[root@gridserver .globus]# cp /home/fitgrid/.globus/usercert_request.pem /home/globus/
[root@gridserver .globus]# chown globus:globus /home/globus/usercert_request.pem
[root@gridserver .globus]# su - globus

[globus@gridserver ~]$ grid-ca-sign -in usercert_request.pem -out signed.pem
To sign the request
please enter the password for the CA key:
The new signed certificate is at: /home/globus/.globus/simpleCA//newcerts/02.pem
Now user fitgrid copies the cert to the proper location:

[globus@gridserver ~]$ exit
[root@gridserver ~]# cp /home/globus/.globus/simpleCA//newcerts/02.pem /home/fitgrid/.globus/usercert.pem
cp: overwrite `/home/fitgrid/.globus/usercert.pem'? y

[root@gridserver ~]# su - fitgrid
[fitgrid@gridserver ~]$ source $GLOBUS_LOCATION/etc/globus-user-env.sh
[fitgrid@gridserver ~]$ grid-proxy-init -valid 24000:0
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid
Enter GRID pass phrase for this identity:
Creating proxy ............................ Done
Warning: your certificate and proxy will expire Tue Sep 23 22:19:42 2008
which is within the requested lifetime of the proxy
The "Your identity:" string above is needed in the next step.

8.Our last act will be to create a grid-mapfile as root for authorization:
[root@gridserver grid-security]# vi /etc/grid-security/grid-mapfile
"/O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid" fitgrid

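That is, the "Your identity" string in double quotes, then a space, then the username. Two pieces of information are needed:
a. the user's subject name: grid-cert-info -subject
b. the account name to map to: whoami
Each user takes one line in this file.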
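The one-line-per-user format described above can be sketched as two small helpers: one that formats an entry and one that looks up the account for a given subject. The function names `gridmap_entry` and `gridmap_lookup` are hypothetical, and the demo writes to a temp file rather than /etc/grid-security/grid-mapfile.

```shell
# Format one grid-mapfile line: quoted subject DN, a space, the local account.
gridmap_entry() {
    printf '"%s" %s\n' "$1" "$2"
}

# Print the local account mapped to a DN in a grid-mapfile.
# Splitting on double quotes makes field 2 the DN and field 3 " account".
gridmap_lookup() {
    awk -F'"' -v dn="$2" '$2 == dn { sub(/^[ \t]+/, "", $3); print $3 }' "$1"
}

demo_mapfile=$(mktemp)
gridmap_entry "/O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid" fitgrid > "$demo_mapfile"
cat "$demo_mapfile"
gridmap_lookup "$demo_mapfile" "/O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid"
```

Run as the user in question, `gridmap_entry "$(grid-cert-info -subject)" "$(whoami)"` would print that user's line, which root can then append to the real file.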

III. Installing PostgreSQL

Before starting the container and the grid services that run within it such as GRAM WS, we
need to configure the RFT service. RFT requires a relational database Postgres in order to
preserve state across machine shutdowns. So we need to initialize the database tables.

1.
[root@gridserver ~]# tar zxvf postgresql-8.2.5.tar.gz -C /usr/local/src/
[root@gridserver ~]# mkdir -p /usr/local/postsql-8.2.5/data
[root@gridserver ~]# chown -R postgre:postgre /usr/local/postsql-8.2.5/
[root@gridserver ~]# cd /usr/local/src/postgresql-8.2.5
[root@gridserver postgresql-8.2.5]# ./configure --prefix=/usr/local/postsql-8.2.5
[root@gridserver postgresql-8.2.5]# make && make install

2. Set the environment variables:
[root@gridserver postsql-8.2.5]# vi /home/postgre/.bash_profile   (append at the end:)
PGLIB=/usr/local/postsql-8.2.5/lib
PGDATA=/usr/local/postsql-8.2.5/data
PATH=/usr/local/postsql-8.2.5/bin:$PATH
export PGLIB PGDATA PATH
[root@gridserver postsql-8.2.5]# source /etc/profile

3.Initialize the database
[root@gridserver postsql-8.2.5]# su - postgre
[postgre@gridserver ~]$ initdb -D /usr/local/postsql-8.2.5/data

4.Starting the database:
[postgre@gridserver ~]$ postmaster -i -D /usr/local/postsql-8.2.5/data &
ps -A shows three postmaster processes; these are the server processes.

[postgre@gridserver ~]$ createuser globus
Shall the new role be a superuser? (y/n) y
CREATE ROLE

5. Allow the 'globus' user to connect to the database from this host.
[postgre@gridserver data]$ exit
[root@gridserver ~]# vi /usr/local/postsql-8.2.5/data/pg_hba.conf   (append at the end:)
# IPv4 local connections:
host all all 10.0.1.190 255.255.255.0 trust
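The rule above lists 10.0.1.190 while the host is 10.0.1.191. With an address-plus-mask rule, a client matches when its address and the rule address agree on every bit the mask selects, so the whole 10.0.1.x range is covered. The sketch below works through that arithmetic; `ip_to_int` and `ip_matches` are illustrative helpers, not PostgreSQL tools.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
    echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# Succeed if the client address falls inside the network given by
# rule address + mask, i.e. both agree on all masked bits.
ip_matches() {
    client=$(ip_to_int "$1"); addr=$(ip_to_int "$2"); mask=$(ip_to_int "$3")
    [ $(( client & mask )) -eq $(( addr & mask )) ]
}

if ip_matches 10.0.1.191 10.0.1.190 255.255.255.0; then
    echo "10.0.1.191 is covered by the rule"
fi
if ! ip_matches 10.0.2.5 10.0.1.190 255.255.255.0; then
    echo "10.0.2.5 is not covered"
fi
```

Because the auth method is trust, every host in that range can connect without a password; a mask of 255.255.255.255 with the exact host address would narrow the rule to this machine.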

6.kill postmaster and restart it:
[root@gridserver ~]# killall -9 postmaster
[root@gridserver ~]# su - postgre -c "postmaster -i -D /usr/local/postsql-8.2.5/data &"

7.Create database and initialize table in that database:
[root@gridserver ~]# su - postgre
[postgre@gridserver ~]$ createdb rftDatabase
CREATE DATABASE

[postgre@gridserver ~]$ psql -d rftDatabase -f $GLOBUS_LOCATION/share/globus_wsrf_rft/rft_schema.sql
psql:/usr/local/globus-4.0.5/share/globus_wsrf_rft/rft_schema.sql:6:NOTICE:CREATE
TABLE / PRIMARY KEY will create implicit index "requestid_pkey" for table "requestid"
CREATE TABLE
psql:/usr/local/globus-4.0.5/share/globus_wsrf_rft/rft_schema.sql:11:NOTICE:CREATE
TABLE / PRIMARY KEY will create implicit index "transferid_pkey" for table
"transferid"
CREATE TABLE
psql:/usr/local/globus-4.0.5/share/globus_wsrf_rft/rft_schema.sql:30: NOTICE:
CREATE TABLE / PRIMARY KEY will create implicit index "request_pkey" for table
"request"
CREATE TABLE
psql:/usr/local/globus-4.0.5/share/globus_wsrf_rft/rft_schema.sql:65: NOTICE:
CREATE TABLE / PRIMARY KEY will create implicit index "transfer_pkey" for table
"transfer"
CREATE TABLE
psql:/usr/local/globus-4.0.5/share/globus_wsrf_rft/rft_schema.sql:71: NOTICE:
CREATE TABLE / PRIMARY KEY will create implicit index "restart_pkey" for table
"restart"
CREATE TABLE
CREATE TABLE
CREATE INDEX

8.start container and the grid services that will run in it:
[postgre@gridserver ~]$ exit
[root@gridserver ~]# su - globus
[globus@gridserver ~]$ source /usr/local/globus-4.0.5/etc/globus-user-env.sh
[globus@gridserver ~]$ globus-start-container
If everything is correct, the output looks like this:
Starting SOAP server at: https://10.0.1.191:8443/wsrf/services/
With the following services:
[1]: https://10.0.1.191:8443/wsrf/services/AdminService
[2]: https://10.0.1.191:8443/wsrf/services/AuthzCalloutTestService
[3]: https://10.0.1.191:8443/wsrf/services/CASService
[4]: https://10.0.1.191:8443/wsrf/services/ContainerRegistryEntryService
[5]: https://10.0.1.191:8443/wsrf/services/ContainerRegistryService
[6]: https://10.0.1.191:8443/wsrf/services/CounterService
[7]: https://10.0.1.191:8443/wsrf/services/DefaultIndexService
[8]: https://10.0.1.191:8443/wsrf/services/DefaultIndexServiceEntry
[9]: https://10.0.1.191:8443/wsrf/services/DefaultTriggerService
[10]: https://10.0.1.191:8443/wsrf/services/DefaultTriggerServiceEntry
[11]: https://10.0.1.191:8443/wsrf/services/DelegationFactoryService
[12]: https://10.0.1.191:8443/wsrf/services/DelegationService
[13]: https://10.0.1.191:8443/wsrf/services/DelegationTestService
[14]: https://10.0.1.191:8443/wsrf/services/InMemoryServiceGroup
[15]: https://10.0.1.191:8443/wsrf/services/InMemoryServiceGroupEntry
[16]: https://10.0.1.191:8443/wsrf/services/InMemoryServiceGroupFactory
[17]: https://10.0.1.191:8443/wsrf/services/IndexFactoryService
[18]: https://10.0.1.191:8443/wsrf/services/IndexService
[19]: https://10.0.1.191:8443/wsrf/services/IndexServiceEntry
[20]: https://10.0.1.191:8443/wsrf/services/ManagedExecutableJobService
[21]: https://10.0.1.191:8443/wsrf/services/ManagedJobFactoryService
[22]: https://10.0.1.191:8443/wsrf/services/ManagedMultiJobService
[23]: https://10.0.1.191:8443/wsrf/services/ManagementService
[24]: https://10.0.1.191:8443/wsrf/services/NotificationConsumerFactoryService
[25]: https://10.0.1.191:8443/wsrf/services/NotificationConsumerService
[26]: https://10.0.1.191:8443/wsrf/services/NotificationTestService
[27]: https://10.0.1.191:8443/wsrf/services/PersistenceTestSubscriptionManager
[28]: https://10.0.1.191:8443/wsrf/services/ReliableFileTransferFactoryService
[29]: https://10.0.1.191:8443/wsrf/services/ReliableFileTransferService
[30]: https://10.0.1.191:8443/wsrf/services/RendezvousFactoryService
[31]: https://10.0.1.191:8443/wsrf/services/ReplicationService
[32]: https://10.0.1.191:8443/wsrf/services/SampleAuthzService
[33]: https://10.0.1.191:8443/wsrf/services/SecureCounterService
[34]: https://10.0.1.191:8443/wsrf/services/SecurityTestService
[35]: https://10.0.1.191:8443/wsrf/services/ShutdownService
[36]: https://10.0.1.191:8443/wsrf/services/SubscriptionManagerService
[37]: https://10.0.1.191:8443/wsrf/services/TestAuthzService
[38]: https://10.0.1.191:8443/wsrf/services/TestRPCService
[39]: https://10.0.1.191:8443/wsrf/services/TestService
[40]: https://10.0.1.191:8443/wsrf/services/TestServiceRequest
[41]: https://10.0.1.191:8443/wsrf/services/TestServiceWrongWSDL
[42]: https://10.0.1.191:8443/wsrf/services/TriggerFactoryService
[43]: https://10.0.1.191:8443/wsrf/services/TriggerService
[44]: https://10.0.1.191:8443/wsrf/services/TriggerServiceEntry
[45]: https://10.0.1.191:8443/wsrf/services/Version
[46]: https://10.0.1.191:8443/wsrf/services/WidgetNotificationService
[47]: https://10.0.1.191:8443/wsrf/services/WidgetService
[48]: https://10.0.1.191:8443/wsrf/services/gsi/AuthenticationService
[49]: https://10.0.1.191:8443/wsrf/services/mds/test/execsource/IndexService
[50]: https://10.0.1.191:8443/wsrf/services/mds/test/execsource/IndexServiceEntry
[51]: https://10.0.1.191:8443/wsrf/services/mds/test/subsource/IndexService
[52]: https://10.0.1.191:8443/wsrf/services/mds/test/subsource/IndexServiceEntry
2007-09-25 20:56:24,735 INFO impl.DefaultIndexService [ServiceThread-12,
processConfigFile:107] Reading default registration configuration from file:
/usr/local/globus-4.0.5/etc/globus_wsrf_mds_index/hierarchy.xml

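Leave this shell alone and open another one. When jobs are submitted later, this shell shows the corresponding error or success
messages. Alternatively, run the container in the background and save its output to a file: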
[globus@gridserver postgre]$ globus-start-container > $HOME/container.out 2>&1 &
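When the container runs in the background, its startup banner ends up in the log file instead of on screen. A small polling loop can wait for that banner before jobs are submitted. This is a sketch only: `wait_for_line` is a made-up helper, the timeout values are arbitrary, and the demo uses a background writer in place of the real container.

```shell
# Poll a log file until a pattern appears or the timeout (in seconds) runs out.
wait_for_line() {
    file=$1; pattern=$2; remaining=$3
    while [ "$remaining" -gt 0 ]; do
        if [ -f "$file" ] && grep -q "$pattern" "$file"; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Demo: a background writer stands in for the container's startup output.
demo_log=$(mktemp)
( sleep 1; echo "Starting SOAP server at: https://10.0.1.191:8443/wsrf/services/" >> "$demo_log" ) &
if wait_for_line "$demo_log" "Starting SOAP server" 10; then
    echo "container reported startup"
else
    echo "timed out waiting for startup"
fi
wait
```

Against the real log this would be something like `wait_for_line "$HOME/container.out" "Starting SOAP server" 120`.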

IV. Configuring GridFTP and GRAM

1. [root@gridserver ~]# vi /etc/xinetd.d/gridftp
service gsiftp
{
instances = 100
socket_type = stream
wait = no
user = root
env += GLOBUS_LOCATION=/usr/local/globus-4.0.5
env += LD_LIBRARY_PATH=/usr/local/globus-4.0.5/lib
server = /usr/local/globus-4.0.5/sbin/globus-gridftp-server
server_args = -i
log_on_success += DURATION
nice = 10
disable = no
}
2. [root@gridserver ~]# vi /etc/services   (append at the end:)
# Local services
gsiftp 2811/tcp
[root@gridserver ~]# /etc/init.d/xinetd reload
[root@gridserver ~]# netstat -a |grep gsiftp
tcp 0 0 *:gsiftp *:* LISTEN
The GridFTP service is now up and listening.

3. Test GridFTP:
[root@gridserver ~]# su - fitgrid
[fitgrid@gridserver ~]$ source /usr/local/globus-4.0.5/etc/globus-user-env.sh
[fitgrid@gridserver ~]$ grid-proxy-init -verify -debug
User Cert File: /home/fitgrid/.globus/usercert.pem
User Key File: /home/fitgrid/.globus/userkey.pem
Trusted CA Cert Dir: /etc/grid-security/certificates
Output File: /tmp/x509up_u601
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA-gridserver/CN=fitgrid
Enter GRID pass phrase for this identity:
Creating proxy .....++++++++++++
...............++++++++++++
Done
Proxy Verify OK
Your proxy is valid until: Wed Sep 26 14:26:36 2007

[fitgrid@gridserver root]$ globus-url-copy gsiftp://gridserver/etc/group file:///tmp/fitgrid.test.copy
[fitgrid@gridserver ~]$ diff /tmp/fitgrid.test.copy /etc/group
[fitgrid@gridserver ~]$

Success. Next, configure GRAM (Globus Resource Allocation Manager).

A number of Globus grid services, including GRAM WS, need to use the 'sudo' command in
order to execute processes as the user who is requesting that the service act on their behalf.

[root@gridserver ~]# visudo
Append the following three lines at the end (one comment and two rules). This step is error-prone: each rule must stay on a single line, so it is safest to type it by hand. Copying and pasting can introduce stray line breaks or separator characters:
#Globus GRAM
globus ALL=(fitgrid) NOPASSWD: /usr/local/globus-4.0.5/libexec/globus-gridmap-and-execute -g /etc/grid-security/grid-mapfile /usr/local/globus-4.0.5/libexec/globus-job-manager-script.pl *

globus ALL=(fitgrid) NOPASSWD: /usr/local/globus-4.0.5/libexec/globus-gridmap-and-execute -g /etc/grid-security/grid-mapfile /usr/local/globus-4.0.5/libexec/globus-gram-local-proxy-tool *

Those two lines will allow user 'globus' to use sudo to execute commands for user
'fitgrid'. We are using 'fitgrid' as a generic user account.
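Because a wrapped rule breaks the sudoers file, one way to avoid hand-typing mistakes is to generate each rule with `printf`, which guarantees one line per rule. The sketch below only prints the text; it does not touch /etc/sudoers. The function name `gram_sudoers_rules` is hypothetical.

```shell
# Print the two GRAM sudoers rules, each as exactly one line.
# $1 = Globus install prefix, $2 = account that jobs run as.
gram_sudoers_rules() {
    prefix=$1; run_as=$2
    for tool in globus-job-manager-script.pl globus-gram-local-proxy-tool; do
        printf 'globus ALL=(%s) NOPASSWD: %s/libexec/globus-gridmap-and-execute -g /etc/grid-security/grid-mapfile %s/libexec/%s *\n' \
            "$run_as" "$prefix" "$prefix" "$tool"
    done
}

rules=$(gram_sudoers_rules /usr/local/globus-4.0.5 fitgrid)
echo "$rules"
```

The printed lines can be pasted into the visudo session; running `visudo -c` afterwards checks the file's syntax before any job depends on it.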

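Before submitting a job, make sure PostgreSQL, the container (globus-start-container) and GridFTP are running, and that a proxy has been created with grid-proxy-init.
Start with the simplest case, a fork-run GRAM job: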
[fitgrid@gridserver ~]$ source $GLOBUS_LOCATION/etc/globus-user-env.sh
[fitgrid@gridserver ~]$ globusrun-ws -submit -c /bin/true
Submitting job...Done.
Job ID: uuid:b684d8c4-6bab-11dc-9dcf-0019db71516a
Termination time: 09/26/2007 21:10 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Success!
If the following error appears instead:
Submitting job...Done.
Job ID: uuid:ef250a60-6b91-11dc-8e00-0019db71516a
Termination time: 09/26/2007 18:05 GMT
Current job state: Failed
Destroying job...Done.
globusrun-ws: Job failed: Error code: 201
Script stderr:
sudoers file: syntax error, line 32 /usr/bin/sudo: parse error in
/etc/sudoers near line 32
You have new mail in /var/spool/mail/root

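Go back and check the sudoers file; line breaks introduced by copy-pasting cause this error. If that does not help, stop
globus-start-container and start it again. Continuing, the following error appeared: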
Submitting job...Done.
Job ID: uuid:cb590fa2-76d4-11dc-ac9b-0019db71516a
Termination time: 10/11/2007 02:02 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Failed
Destroying job...Done.

Edit /usr/local/globus-4.0.5/container-log4j.properties and change
# Display any warnings generated by our code
log4j.category.org.globus=INFO

to:
# Display any warnings generated by our code
log4j.category.org.globus=DEBUG

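Then restart the container; submitting the job now succeeds. Changing DEBUG back to INFO afterwards still works.
However, after another container restart the job sometimes fails again, and the log4j.category.org.globus=INFO line has to be changed once more.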
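Since the log level may need to be flipped more than once, the edit can be scripted. The following is a sketch that rewrites only the `log4j.category.org.globus` line; `set_globus_log_level` is a made-up helper, and the demo operates on a temp copy rather than the real container-log4j.properties.

```shell
# Rewrite the log4j.category.org.globus level in a properties file.
set_globus_log_level() {
    file=$1; level=$2
    tmp=$(mktemp)
    # Write to a temp file first, then copy the contents back so the
    # original file keeps its owner and permissions.
    sed "s/^log4j\.category\.org\.globus=.*/log4j.category.org.globus=$level/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

demo_props=$(mktemp)
printf '%s\n' '# Display any warnings generated by our code' \
              'log4j.category.org.globus=INFO' > "$demo_props"
set_globus_log_level "$demo_props" DEBUG
cat "$demo_props"
```

The container still has to be restarted after each change for the new level to take effect.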
Next, test with an XML job description:
[fitgrid@gridserver ~]$ vi job2.xml
<job>
<executable>my_echo</executable>
<directory>/home/fitgrid</directory>
<argument>Hello</argument>
<argument>World!</argument>
<stdout>/home/fitgrid/stdout</stdout>
<stderr>/home/fitgrid/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://gridserver:2811/bin/echo</sourceUrl>
<destinationUrl>file:///home/fitgrid/my_echo</destinationUrl>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///home/fitgrid/stdout</sourceUrl>
<destinationUrl>gsiftp://gridserver:2811/tmp/stdout</destinationUrl>
</transfer>
</fileStageOut>
<fileCleanUp>
<deletion>
<file>file:///home/fitgrid/my_echo</file>
</deletion>
</fileCleanUp>
</job>
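The job description above hard-codes the host name and home directory in several places. To reuse it for another user or host, the same document can be produced from a here-document. This is only a sketch mirroring the file above; `make_echo_job` and its two parameters are hypothetical.

```shell
# Emit a job description that stages in an executable, runs it, stages
# stdout back out and cleans up. $1 = GridFTP host, $2 = working directory.
make_echo_job() {
    host=$1; workdir=$2
    cat <<EOF
<job>
    <executable>my_echo</executable>
    <directory>$workdir</directory>
    <argument>Hello</argument>
    <argument>World!</argument>
    <stdout>$workdir/stdout</stdout>
    <stderr>$workdir/stderr</stderr>
    <fileStageIn>
        <transfer>
            <sourceUrl>gsiftp://$host:2811/bin/echo</sourceUrl>
            <destinationUrl>file://$workdir/my_echo</destinationUrl>
        </transfer>
    </fileStageIn>
    <fileStageOut>
        <transfer>
            <sourceUrl>file://$workdir/stdout</sourceUrl>
            <destinationUrl>gsiftp://$host:2811/tmp/stdout</destinationUrl>
        </transfer>
    </fileStageOut>
    <fileCleanUp>
        <deletion>
            <file>file://$workdir/my_echo</file>
        </deletion>
    </fileCleanUp>
</job>
EOF
}

job_xml=$(make_echo_job gridserver /home/fitgrid)
echo "$job_xml"
```

Redirecting the output, for example `make_echo_job gridserver /home/fitgrid > job2.xml`, gives a file equivalent to the one typed in by hand.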

[fitgrid@gridserver ~]$ globusrun-ws -submit -S -f /home/fitgrid/job2.xml
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:5844c630-78a2-11dc-b7c2-0019db71516a
Termination time: 10/13/2007 09:05 GMT
Current job state: StageIn
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

[fitgrid@gridserver ~]$ ls -l
-rw-r--r-- 1 fitgrid fitgrid 56 Oct 12 17:11 stdout
[fitgrid@gridserver ~]$ cat stdout
Hello World!

[fitgrid@gridserver ~]$ ls -l /tmp
-rw-r--r-- 1 fitgrid fitgrid 56 Oct 12 17:11 stdout

[fitgrid@gridserver ~]$ cat /tmp/stdout
Hello World!
That worked too! To stream the job's output to the current shell, add -s:

[fitgrid@gridserver ~]$ globusrun-ws -submit -s -c /bin/date
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:8f15d35a-7b1f-11dc-94ae-0019db71516a
Termination time: 10/16/2007 13:07 GMT
Current job state: Active
Current job state: CleanUp-Hold
Mon Oct 15 21:07:20 CST 2007
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
To write the output to a file, add -so:

[fitgrid@gridserver ~]$ globusrun-ws -submit -s -so job.out -c /bin/date
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:6c359248-7b20-11dc-ba15-0019db71516a
Termination time: 10/16/2007 13:13 GMT
Current job state: Active
Current job state: CleanUp-Hold
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

[fitgrid@gridserver ~]$ cat job.out
Mon Oct 15 21:13:31 CST 2007

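Mind the order of the options: options placed after -c are ignored. To deploy a service of your own, see:
http://gdp.globus.org/gt4-tutorial/ The PDF there also covers authentication and is worth reading.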

That is enough for a basic setup. To submit jobs through PBS, read on:

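V. Submitting jobs through PBS
1. First install OpenPBS, with the PBS server on this same machine. I did not manage to get it working with the PBS server on a different machine.
2. The GT installation is then almost identical to the non-PBS case above; just add --enable-wsgram-pbs when configuring:
./configure --prefix=/usr/local/globus-4.0.5 --enable-wsgram-pbs && make && make install
Any WARNINGs that appear do not affect the result and can be ignored.
The globus user needs read permission on the scheduler logs, here $PBS_HOME/server_logs. Check: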

[globus@gridserver ~]$ cat /usr/local/globus-4.0.5/etc/globus-pbs.conf
log_path=/usr/spool/PBS/server_logs
This shows the log path.
[globus@gridserver ~]$ ls -l /usr/spool/PBS/server_logs
total 40
-rw-r--r-- 1 root root 33162 Oct 15 10:54 20071015
Read permission is already in place.
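The same check can be scripted: read `log_path` from the configuration file, then confirm the current user can read the directory and every file in it. The sketch below assumes the `log_path=` line format shown above; `pbs_log_path` and `logs_readable` are made-up helper names, and the demo uses a temp directory rather than the real PBS logs.

```shell
# Read the log_path value from a globus-pbs.conf style file.
pbs_log_path() {
    sed -n 's/^log_path=//p' "$1" | head -n 1
}

# Succeed if the directory and every file in it are readable by the
# user running this function.
logs_readable() {
    dir=$1
    [ -d "$dir" ] && [ -r "$dir" ] || return 1
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        [ -r "$f" ] || return 1
    done
    return 0
}

demo=$(mktemp -d)
mkdir "$demo/server_logs"
echo "sample log line" > "$demo/server_logs/20071015"
echo "log_path=$demo/server_logs" > "$demo/globus-pbs.conf"

log_dir=$(pbs_log_path "$demo/globus-pbs.conf")
if logs_readable "$log_dir"; then
    echo "$log_dir: readable"
else
    echo "$log_dir: not readable"
fi
```

The result is only meaningful when run as the globus user, for example `logs_readable "$(pbs_log_path "$GLOBUS_LOCATION/etc/globus-pbs.conf")"`.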

3. Submit a job through PBS
[root@gridserver ~]# su - globus
[globus@gridserver ~]$ cd gt4.0.5-x86_rhas_4-installer
[globus@gridserver gt4.0.5-x86_rhas_4-installer]$ make gt4-gram-pbs && make install
[globus@gridserver gt4.0.5-x86_rhas_4-installer]$ cd $GLOBUS_LOCATION/setup/globus

[globus@gridserver globus]$ ./setup-globus-job-manager-pbs --remote-shell=rsh
find-pbs-tools: WARNING: "Cannot locate mpiexec"
find-pbs-tools: WARNING: "Cannot locate mpirun"
checking for mpiexec... no
checking for mpirun... no
checking for qdel... /usr/local/bin/qdel
checking for qstat... /usr/local/bin/qstat
checking for qsub... /usr/local/bin/qsub
checking for rsh... /usr/kerberos/bin/rsh
find-pbs-tools:creating ./config.status
config.status:creating /usr/local/globus-4.0.5/lib/perl/Globus/GRAM/JobManager/pbs.pm

[fitgrid@gridserver ~]$ globusrun-ws -Ft PBS -submit -S -f /home/fitgrid/job2.xml
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:8e294524-7aef-11dc-9814-0019db71516a
Termination time: 10/16/2007 07:23 GMT
Current job state: StageIn
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

[fitgrid@gridserver ~]$ ls -l
-rw-r--r-- 1 fitgrid fitgrid 0 Oct 15 15:23 stderr
-rw-r--r-- 1 fitgrid fitgrid 13 Oct 15 15:23 stdout

[fitgrid@gridserver ~]$ cat stdout
Hello World!

[fitgrid@gridserver ~]$ ls -l /tmp
-rw-r--r-- 1 fitgrid fitgrid 0 Oct 15 15:23 stderr
-rw-r--r-- 1 fitgrid fitgrid 13 Oct 15 15:23 stdout

[fitgrid@gridserver ~]$ cat /tmp/stdout
Hello World!

This time it worked!
One more test: with plain OpenPBS a job was submitted with qsub sample.sh; now do the same through GT with an XML file:
[fitgrid@gridserver ~]$ vi sample.sh
#!/bin/bash
echo `date` >> /home/fitgrid/sample.out
sleep 20
echo `date` >> /home/fitgrid/sample.out

[fitgrid@gridserver ~]$ vi sample.xml
<job>
<executable>qsub</executable>
<directory>/usr/local/bin</directory>
<argument>/home/fitgrid/sample.sh</argument>
<stdout>/home/fitgrid/stdout</stdout>
<stderr>/home/fitgrid/stderr</stderr>
</job>

[fitgrid@gridserver ~]$ globusrun-ws -submit -F https://10.0.1.191:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -S -f /home/fitgrid/sample.xml
