Commit 5f7fb7e by codejedi · 1 Parent(s): 90ab1fb

Fix NLTK download syntax error in Dockerfile by using separate Python script
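
Note on the likely cause: Dockerfile line continuations join the inline `python -c` program into a single logical line, and Python does not allow a compound statement such as `try:` to follow a `;`, nor an `except` clause to sit on the same line as its `try` body. The sketch below reproduces that with shortened stand-in strings (not the exact removed command); `compiles` is a hypothetical helper introduced only for this illustration.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<inline>", "exec")
        return True
    except SyntaxError:
        return False


# Roughly what `python -c` sees once the backslash continuations are joined:
# simple statements chained with ';' and then a `try:` on the same line.
joined = (
    "import os; "
    "try: "
    "print('Downloaded punkt'); "
    "except Exception as e: "
    "print(f'Failed: {e}')"
)

# The same logic written as a normal multi-line script.
multiline = (
    "import os\n"
    "try:\n"
    "    print('Downloaded punkt')\n"
    "except Exception as e:\n"
    "    print(f'Failed: {e}')\n"
)

print(compiles(joined))     # False
print(compiles(multiline))  # True
```

Moving the logic into a standalone script keeps real newlines and indentation, which is why the separate-file approach sidesteps the problem.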

Files changed (2)
  1. Dockerfile +1 -15
  2. download_nltk_data.py +20 -0
Dockerfile CHANGED
@@ -23,21 +23,7 @@ RUN pip install --no-cache-dir --upgrade pip setuptools wheel && \
  COPY . .
 
  # Download NLTK data during build with error handling
- RUN python -c "import nltk; import os; \
- nltk_data_dir = '/root/nltk_data'; \
- os.makedirs(nltk_data_dir, exist_ok=True); \
- nltk.data.path.append(nltk_data_dir); \
- try: \
- nltk.download('punkt', quiet=True); \
- print('Downloaded punkt'); \
- except Exception as e: \
- print(f'Failed to download punkt: {e}'); \
- try: \
- nltk.download('vader_lexicon', quiet=True); \
- print('Downloaded vader_lexicon'); \
- except Exception as e: \
- print(f'Failed to download vader_lexicon: {e}'); \
- print('NLTK download step completed')"
+ RUN python download_nltk_data.py
 
  # Expose the port the app runs on
  EXPOSE 7860
download_nltk_data.py ADDED
@@ -0,0 +1,20 @@
+ """Script to download NLTK data during Docker build"""
+ import nltk
+ import os
+
+ # Set NLTK data directory
+ nltk_data_dir = '/root/nltk_data'
+ os.makedirs(nltk_data_dir, exist_ok=True)
+ nltk.data.path.append(nltk_data_dir)
+
+ # Download required NLTK data
+ required_data = ['punkt', 'vader_lexicon']
+ for data_name in required_data:
+     try:
+         nltk.download(data_name, quiet=True)
+         print(f'Downloaded {data_name}')
+     except Exception as e:
+         print(f'Failed to download {data_name}: {e}')
+
+ print('NLTK download step completed')
+
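
Possible follow-up, not part of this commit: as far as I know, `nltk.download` signals many failures (for example, no network during the build) by returning `False` rather than raising, so the `try`/`except` above may still print "Downloaded ..." when nothing was fetched. A minimal sketch of a stricter loop follows, assuming that return-value behaviour. `download_all` is a hypothetical helper, and the downloader is passed in as a parameter so the logic can be exercised without NLTK installed.

```python
from typing import Callable, Iterable, List


def download_all(names: Iterable[str], download: Callable[..., bool]) -> List[str]:
    """Try to download each resource and return the names that failed.

    `download` is expected to behave like `nltk.download`: it takes a
    resource name plus `quiet=True` and returns a truthy value on success.
    """
    failed: List[str] = []
    for name in names:
        try:
            ok = bool(download(name, quiet=True))
        except Exception as exc:
            print(f"Failed to download {name}: {exc}")
            ok = False
        else:
            if not ok:
                print(f"Failed to download {name}: downloader reported failure")
        if ok:
            print(f"Downloaded {name}")
        else:
            failed.append(name)
    return failed


# Intended use inside download_nltk_data.py (assumes nltk is installed):
#   import sys, nltk
#   failed = download_all(['punkt', 'vader_lexicon'], nltk.download)
#   sys.exit(1 if failed else 0)
```

Exiting non-zero would make `docker build` stop at this step instead of producing an image with missing data. Whether that is preferable to the current best-effort behaviour is a design choice for the maintainer.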