Setting up a Jupyter fastai-ready environment on Docker with 4 GPUs

First, ensure that all the GPUs are visible from Docker using the NVIDIA Container Toolkit (nvidia-docker):

root@gpu-server:/home/mano# docker run --gpus all nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04 nvidia-smi

==========
== CUDA ==
==========

CUDA Version 11.6.2

Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Fri Jan  6 00:36:05 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   51C    P8    11W / 230W |    100MiB /  8118MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   52C    P8     9W / 230W |      8MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:04:00.0 Off |                  N/A |
|  0%   48C    P8    10W / 230W |      8MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
|  0%   44C    P8     9W / 230W |      8MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
root@gpu-server:/home/mano# 

Then run the GPU-enabled Jupyter Docker image:

root@gpu-server:~# docker run --gpus all -d -it -p 8848:8888 -v $(pwd)/data:/home/jovyan/work -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.4_cuda-11.2_ubuntu-20.04_python-only
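
Once the container is up, Jupyter Lab is reachable on port 8848 of the host (mapped to 8888 inside the container). A minimal sketch to confirm from a notebook that all four GPUs are visible, assuming PyTorch is available in the image:

import torch

# Confirm that CUDA is available inside the container and count the visible GPUs.
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())

for i in range(torch.cuda.device_count()):
    # Print the name of each visible device (should list all 4 GPUs).
    print("cuda:{} -> {}".format(i, torch.cuda.get_device_name(i)))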

Saving dates as metadata

S3 objects do not keep the source file's original change timestamps, so we can store them as user-defined object metadata taken from stat when uploading:

-sh-4.2$ aws s3api put-object --bucket storagegrid-training --key "test.file" --body "test.file" --metadata Change_date=$(stat test.file|grep Change|awk '{print $2}'),Change_time=$(stat test.file|grep Change|awk '{print $3}') --profile default
{
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\""
}




-sh-4.2$ aws s3api get-object --bucket storagegrid-training --key "test.file" "test.file" --profile default
{
    "AcceptRanges": "bytes",
    "LastModified": "2021-12-02T11:21:50+00:00",
    "ContentLength": 0,
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
    "ContentType": "binary/octet-stream",
    "Metadata": {
        "change_date": "2021-12-02",
        "change_time": "12:20:48.279373897"
    }
}
-sh-4.2$
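
The same upload can be scripted. Here is a minimal sketch using boto3, assuming the same "default" profile, bucket and key as in the CLI example above:

import os
import boto3
from datetime import datetime, timezone

# Use the same "default" profile as the CLI examples above.
session = boto3.Session(profile_name="default")
s3 = session.client("s3")

path = "test.file"
# os.stat() exposes the inode change time (st_ctime), matching the stat output above.
ctime = datetime.fromtimestamp(os.stat(path).st_ctime, tz=timezone.utc)

with open(path, "rb") as body:
    s3.put_object(
        Bucket="storagegrid-training",
        Key="test.file",
        Body=body,
        # User-defined metadata keys are stored lowercase, as seen in the get-object output.
        Metadata={
            "change_date": ctime.strftime("%Y-%m-%d"),
            "change_time": ctime.strftime("%H:%M:%S"),
        },
    )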


If you are using Elasticsearch as an external search index, you can then search on the metadata fields, for example:

GET sgmetadata/_search
{
  "query": {
    "term": { "metadata.change_date": "2023-01-17" }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "sgmetadata",
        "_id": "lock-test_elasticsearch3.test_MUE4NThCQ0EtQkQ4QS0xMUVELTk5Q0QtQTZFOTAwQjk0NkFF",
        "_score": 1,
        "_source": {
          "bucket": "lock-test",
          "key": "elasticsearch3.test",
          "versionId": "MUE4NThCQ0EtQkQ4QS0xMUVELTk5Q0QtQTZFOTAwQjk0NkFF",
          "accountId": "32427727073175632701",
          "size": 40960000,
          "md5": "4081a22a1c2ef19e3c44ef14c8006fda",
          "region": "us-east-1",
          "metadata": {
            "change_date": "2023-01-17",
            "change_time": "15:19:36.500205796"
          }
        }
      }
    ]
  }
}
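
The same search can also be run from a script. A minimal sketch with requests, assuming a hypothetical Elasticsearch endpoint (adjust the URL and authentication to your setup):

import requests

# Hypothetical endpoint; replace with your Elasticsearch URL and credentials.
ES_URL = "http://elasticsearch:9200"

query = {
    "query": {
        "term": {"metadata.change_date": "2023-01-17"}
    }
}

# Search the StorageGrid metadata index (sgmetadata) for objects changed on that date.
r = requests.get("{}/sgmetadata/_search".format(ES_URL), json=query)
r.raise_for_status()

for hit in r.json()["hits"]["hits"]:
    src = hit["_source"]
    print(src["bucket"], src["key"], src["metadata"])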

Modify the consistency level of all your buckets in StorageGrid

Sometimes it makes sense to set the consistency level of all your buckets to "available", for example when you are doing maintenance tasks and have one node down most of the time.

The default "read-after-write" policy should work fine, unless your clients start doing HEAD operations, which will return 500 errors when StorageGrid is unable to meet the required consistency level.

In order to change the consistency level of all buckets at once, we can use the tenant API. We define two helper functions as follows:

import requests

def get_consistency_level(tenant_authtoken, bucket_name):
    # Read the current consistency level of a bucket through the tenant API.
    headers = {'Authorization': 'Bearer ' + tenant_authtoken}
    return requests.get(_url('/api/v3/org/containers/{}/consistency'.format(bucket_name)),
                        headers=headers, verify=verify)


def set_consistency_level(tenant_authtoken, bucket_name, level):
    # Change the consistency level of a bucket, e.g. to "available".
    headers = {'Authorization': 'Bearer ' + tenant_authtoken,
               'accept': 'application/json',
               'Content-Type': 'application/json'}

    data = {'consistency': level}

    r = requests.put(_url('/api/v3/org/containers/{}/consistency'.format(bucket_name)),
                     json=data, headers=headers, verify=verify)
    return r
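
These functions rely on a couple of helpers defined elsewhere in the script. A minimal sketch of what they could look like, with an illustrative endpoint name; adjust the URL and TLS settings to your environment:

# Hypothetical value: point this at your StorageGrid management endpoint.
BASE_URL = 'https://storagegrid-admin.example.com'

# Path to a CA bundle, or False to skip TLS verification (not recommended in production).
verify = False

def _url(path):
    # Build the full API URL from a relative path such as '/api/v3/org/containers/...'.
    return BASE_URL + path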

Then we go through all the buckets and change the consistency level:

response = get_tenants_accounts(auth_token)

if response.status_code != 200:
    raise Exception('GET /api/v3/grid/accounts?limit=25 {}'.format(response.status_code)
                    + " Error: " + response.json()["message"]["text"])

# For each tenant account, get the buckets:
for items in response.json()['data']:
    # print('Account tenant id: {}, Tenant Name: {}'.format(items['id'], items['name']))
    tenantid = '{}'.format(items['id'])
    tenant_name = '{}'.format(items['name'])
    buckets_response = get_storage_usage_in_tenant(tenantid, auth_token)

    tenant_auth = get_tenant_token(api_user, api_passwd, tenantid).json()["data"]

    for buckets in buckets_response.json()['data']['buckets']:
        # Show the current consistency level before changing it.
        print('Tenant name: {} Bucket name: {} Consistency Level: {}'.format(
            items['name'], buckets['name'],
            get_consistency_level(tenant_auth, buckets['name']).json()['data']))

        setresponse = set_consistency_level(tenant_auth, buckets['name'], "available")
        if setresponse.status_code != 200:
            raise Exception('PUT set consistency level: {}'.format(setresponse.status_code)
                            + " Error: " + setresponse.json()["message"]["text"])
            

ssh X forwarding after sudo

In order to keep X11 forwarding from ssh -X or ssh -Y working after doing sudo, we need to copy the magic cookie with xauth.

-sh-4.2$ ssh -Y mano@server12

-sh-4.2$ xauth list|tail -1
server12.sunwave.es:10  MIT-MAGIC-COOKIE-1  38a81b09365e1b5d13c50ad53d378a78

-sh-4.2$ sudo su
[sudo] password for mano:

[root@server12 mano]# xauth
Using authority file /root/.Xauthority
xauth> add server12.sunwave.es:10  MIT-MAGIC-COOKIE-1  38a81b09365e1b5d13c50ad53d378a78
xauth> exit
Writing authority file /root/.Xauthority
[root@server12 mano]# xeyes

Now we can launch GUI applications as root, and the X traffic will be forwarded through our SSH session.