>>> py3-tika: Building community/py3-tika 3.1.0-r0 (using abuild 3.15.0-r0) started Mon, 13 Oct 2025 05:07:47 +0000
>>> py3-tika: Validating /home/udu/aports/community/py3-tika/APKBUILD...
>>> py3-tika: Analyzing dependencies...
>>> py3-tika: Installing for build: build-base py3-requests py3-setuptools py3-gpep517 py3-wheel py3-pytest py3-pytest-benchmark py3-pytest-cov py3-coveralls py3-yaml openjdk21-jre-headless
WARNING: opening /home/udu/packages//community: No such file or directory
WARNING: opening /home/udu/packages//main: No such file or directory
fetch http://dl-cdn.alpinelinux.org/alpine/v3.22/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.22/community/x86_64/APKINDEX.tar.gz
(1/52) Installing py3-certifi-pyc (2025.4.26-r0)
(2/52) Installing py3-charset-normalizer (3.4.2-r0)
(3/52) Installing py3-charset-normalizer-pyc (3.4.2-r0)
(4/52) Installing py3-idna (3.10-r0)
(5/52) Installing py3-idna-pyc (3.10-r0)
(6/52) Installing py3-urllib3 (1.26.20-r0)
(7/52) Installing py3-urllib3-pyc (1.26.20-r0)
(8/52) Installing py3-requests-pyc (2.32.5-r0)
(9/52) Installing py3-certifi (2025.4.26-r0)
(10/52) Installing py3-requests (2.32.5-r0)
(11/52) Installing py3-parsing (3.2.3-r0)
(12/52) Installing py3-parsing-pyc (3.2.3-r0)
(13/52) Installing py3-packaging (25.0-r0)
(14/52) Installing py3-packaging-pyc (25.0-r0)
(15/52) Installing py3-setuptools (80.9.0-r0)
(16/52) Installing py3-setuptools-pyc (80.9.0-r0)
(17/52) Installing py3-installer (0.7.0-r2)
(18/52) Installing py3-installer-pyc (0.7.0-r2)
(19/52) Installing py3-gpep517 (19-r0)
(20/52) Installing py3-gpep517-pyc (19-r0)
(21/52) Installing py3-wheel (0.46.1-r0)
(22/52) Installing py3-wheel-pyc (0.46.1-r0)
(23/52) Installing py3-iniconfig (2.1.0-r0)
(24/52) Installing py3-iniconfig-pyc (2.1.0-r0)
(25/52) Installing py3-pluggy (1.5.0-r0)
(26/52) Installing py3-pluggy-pyc (1.5.0-r0)
(27/52) Installing py3-py (1.11.0-r4)
(28/52) Installing py3-py-pyc (1.11.0-r4)
(29/52) Installing py3-pytest (8.3.5-r0)
(30/52) Installing py3-pytest-pyc (8.3.5-r0)
(31/52) Installing py3-py-cpuinfo (9.0.0-r4)
(32/52) Installing py3-py-cpuinfo-pyc (9.0.0-r4)
(33/52) Installing py3-pytest-benchmark (4.0.0-r4)
(34/52) Installing py3-pytest-benchmark-pyc (4.0.0-r4)
(35/52) Installing py3-coverage (7.8.2-r0)
(36/52) Installing py3-coverage-pyc (7.8.2-r0)
(37/52) Installing py3-pytest-cov (5.0.0-r0)
(38/52) Installing py3-pytest-cov-pyc (5.0.0-r0)
(39/52) Installing py3-docopt (0.6.2-r11)
(40/52) Installing py3-docopt-pyc (0.6.2-r11)
(41/52) Installing py3-coveralls (3.3.1-r1)
(42/52) Installing py3-coveralls-pyc (3.3.1-r1)
(43/52) Installing yaml (0.2.5-r2)
(44/52) Installing py3-yaml (6.0.2-r0)
(45/52) Installing py3-yaml-pyc (6.0.2-r0)
(46/52) Installing java-common (1.0-r0)
(47/52) Installing libtasn1 (4.20.0-r0)
(48/52) Installing p11-kit (0.25.5-r2)
(49/52) Installing p11-kit-trust (0.25.5-r2)
(50/52) Installing java-cacerts (1.1-r0)
(51/52) Installing openjdk21-jre-headless (21.0.8_p9-r0)
(52/52) Installing .makedepends-py3-tika (20251013.050748)
Executing busybox-1.37.0-r19.trigger
Executing java-common-1.0-r0.trigger
Executing ca-certificates-20250911-r0.trigger
OK: 505 MiB in 141 packages
>>> py3-tika: Cleaning up srcdir
>>> py3-tika: Cleaning up pkgdir
>>> py3-tika: Cleaning up tmpdir
>>> py3-tika: Fetching py3-tika-3.1.0-gh.tar.gz::https://github.com/chrismattmann/tika-python/archive/refs/tags/3.1.0.tar.gz
>>> py3-tika: Fetching py3-tika-3.1.0-gh.tar.gz::https://github.com/chrismattmann/tika-python/archive/refs/tags/3.1.0.tar.gz
>>> py3-tika: Checking sha512sums...
py3-tika-3.1.0-gh.tar.gz: OK
>>> py3-tika: Unpacking /var/cache/distfiles/py3-tika-3.1.0-gh.tar.gz...
2025-10-13 05:07:49,549 gpep517 INFO Building wheel via backend setuptools.build_meta:__legacy__
/home/udu/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html.
The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  __import__('pkg_resources').declare_namespace(__name__)
/usr/lib/python3.12/site-packages/setuptools/_distutils/dist.py:289: UserWarning: Unknown distribution option: 'test_suite'
  warnings.warn(msg)
/usr/lib/python3.12/site-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
!!

        ********************************************************************************
        Please consider removing the following classifiers in favor of a SPDX license expression:

        License :: OSI Approved :: Apache Software License

        See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
        ********************************************************************************

!!
  self._finalize_license_expression()
2025-10-13 05:07:49,577 root INFO running bdist_wheel
2025-10-13 05:07:49,588 root INFO running build
2025-10-13 05:07:49,588 root INFO running build_py
2025-10-13 05:07:49,591 root INFO creating build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/language.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/parser.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/detector.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/tika.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/translate.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/config.py -> build/lib/tika
2025-10-13 05:07:49,591 root INFO copying tika/__init__.py -> build/lib/tika
2025-10-13 05:07:49,592 root INFO copying tika/pdf.py -> build/lib/tika
2025-10-13 05:07:49,592 root INFO copying tika/unpack.py -> build/lib/tika
2025-10-13 05:07:49,592 root INFO creating build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/test_ssl_link.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/tests_unpack.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/memory_benchmark.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/test_from_file_service.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/utils.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/test_tika.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/tests_params.py -> build/lib/tika/tests
2025-10-13 05:07:49,592 root INFO copying tika/tests/test_benchmark.py -> build/lib/tika/tests
2025-10-13 05:07:49,593 root INFO copying tika/tests/__init__.py -> build/lib/tika/tests
2025-10-13 05:07:49,593 root INFO running egg_info
2025-10-13 05:07:49,594 root INFO creating tika.egg-info
2025-10-13 05:07:49,594 root INFO writing tika.egg-info/PKG-INFO
2025-10-13 05:07:49,595 root INFO writing dependency_links to tika.egg-info/dependency_links.txt
2025-10-13 05:07:49,595 root INFO writing entry points to tika.egg-info/entry_points.txt
2025-10-13 05:07:49,596 root INFO writing requirements to tika.egg-info/requires.txt
2025-10-13 05:07:49,596 root INFO writing top-level names to tika.egg-info/top_level.txt
2025-10-13 05:07:49,596 root INFO writing manifest file 'tika.egg-info/SOURCES.txt'
2025-10-13 05:07:49,598 root INFO reading manifest file 'tika.egg-info/SOURCES.txt'
2025-10-13 05:07:49,598 root INFO adding license file 'LICENSE.txt'
2025-10-13 05:07:49,599 root INFO writing manifest file 'tika.egg-info/SOURCES.txt'
2025-10-13 05:07:49,603 root INFO installing to build/bdist.linux-x86_64/wheel
2025-10-13 05:07:49,603 root INFO running install
2025-10-13 05:07:49,608 root INFO running install_lib
2025-10-13 05:07:49,610 root INFO creating build/bdist.linux-x86_64/wheel
2025-10-13 05:07:49,610 root INFO creating build/bdist.linux-x86_64/wheel/tika
2025-10-13 05:07:49,610 root INFO copying build/lib/tika/language.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,610 root INFO copying build/lib/tika/parser.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,610 root INFO copying build/lib/tika/detector.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,610 root INFO copying build/lib/tika/tika.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/translate.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/config.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/__init__.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/pdf.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO creating build/bdist.linux-x86_64/wheel/tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/test_ssl_link.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/tests_unpack.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/memory_benchmark.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/test_from_file_service.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/utils.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/test_tika.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/tests_params.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/test_benchmark.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/tests/__init__.py -> build/bdist.linux-x86_64/wheel/./tika/tests
2025-10-13 05:07:49,611 root INFO copying build/lib/tika/unpack.py -> build/bdist.linux-x86_64/wheel/./tika
2025-10-13 05:07:49,611 root INFO running install_egg_info
2025-10-13 05:07:49,614 root INFO Copying tika.egg-info to build/bdist.linux-x86_64/wheel/./tika-3.1.0-py3.12.egg-info
2025-10-13 05:07:49,614 root INFO running install_scripts
2025-10-13 05:07:49,615 root INFO creating build/bdist.linux-x86_64/wheel/tika-3.1.0.dist-info/WHEEL
2025-10-13 05:07:49,615 wheel INFO creating '/home/udu/aports/community/py3-tika/src/tika-python-3.1.0/.dist/.tmp-i6vs0v5q/tika-3.1.0-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
2025-10-13 05:07:49,615 wheel INFO adding 'tika/__init__.py'
2025-10-13 05:07:49,615 wheel INFO adding 'tika/config.py'
2025-10-13 05:07:49,615 wheel INFO adding 'tika/detector.py'
2025-10-13 05:07:49,615 wheel INFO adding 'tika/language.py'
2025-10-13 05:07:49,615 wheel INFO adding 'tika/parser.py'
2025-10-13 05:07:49,615 wheel INFO adding 'tika/pdf.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tika.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/translate.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/unpack.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/__init__.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/memory_benchmark.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/test_benchmark.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/test_from_file_service.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/test_ssl_link.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/test_tika.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/tests_params.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/tests_unpack.py'
2025-10-13 05:07:49,616 wheel INFO adding 'tika/tests/utils.py'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/licenses/LICENSE.txt'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/METADATA'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/WHEEL'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/entry_points.txt'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/top_level.txt'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/zip-safe'
2025-10-13 05:07:49,617 wheel INFO adding 'tika-3.1.0.dist-info/RECORD'
2025-10-13 05:07:49,617 root INFO removing build/bdist.linux-x86_64/wheel
2025-10-13 05:07:49,617 gpep517 INFO The backend produced .dist/tika-3.1.0-py3-none-any.whl
tika-3.1.0-py3-none-any.whl
*** Error compiling '/home/udu/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/lib/python3.12/site-packages/tika/pdf.py'...
  File "/home/udu/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/lib/python3.12/site-packages/tika/pdf.py", line 19
    import .tika.parser
           ^
SyntaxError: invalid syntax

============================================================================================================================================ test session starts =============================================================================================================================================
platform linux -- Python 3.12.11, pytest-8.3.5, pluggy-1.5.0 -- /home/udu/aports/community/py3-tika/src/tika-python-3.1.0/.testenv/bin/python3
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/udu/aports/community/py3-tika/src/tika-python-3.1.0
plugins: cov-5.0.0, benchmark-4.0.0
collecting ... collected 23 items / 2 deselected / 21 selected

tika/tests/test_benchmark.py::test_local_binary FAILED [  4%]
tika/tests/test_benchmark.py::test_parser_buffer FAILED [  9%]
tika/tests/test_benchmark.py::test_parser_buffer_zlib_input FAILED [ 14%]
tika/tests/test_benchmark.py::test_parser_buffer_gzip_input FAILED [ 19%]
tika/tests/test_benchmark.py::test_local_binary_with_gzip_output FAILED [ 23%]
tika/tests/test_benchmark.py::test_parser_buffer_with_gzip_output FAILED [ 28%]
tika/tests/test_benchmark.py::test_parser_buffer_zlib_input_and_gzip_output FAILED [ 33%]
tika/tests/test_benchmark.py::test_parser_buffer_gzip_input_and_gzip_output FAILED [ 38%]
tika/tests/test_from_file_service.py::CreateTest::test_default_service FAILED [ 42%]
tika/tests/test_from_file_service.py::CreateTest::test_default_service_explicit FAILED [ 47%]
tika/tests/test_from_file_service.py::CreateTest::test_invalid_service FAILED [ 52%]
tika/tests/test_from_file_service.py::CreateTest::test_meta_service FAILED [ 57%]
tika/tests/test_from_file_service.py::CreateTest::test_remote_endpoint PASSED [ 61%]
tika/tests/test_from_file_service.py::CreateTest::test_text_service FAILED [ 66%]
tika/tests/test_tika.py::CreateTest::test_kill_server FAILED [ 71%]
tika/tests/test_tika.py::CreateTest::test_local_binary FAILED [ 76%]
tika/tests/test_tika.py::CreateTest::test_local_buffer FAILED [ 80%]
tika/tests/test_tika.py::CreateTest::test_local_path FAILED [ 85%]
tika/tests/test_tika.py::CreateTest::test_remote_html FAILED [ 90%]
tika/tests/test_tika.py::CreateTest::test_remote_mp3 FAILED [ 95%]
tika/tests/test_tika.py::CreateTest::test_remote_pdf FAILED [100%]

================================================================================================================================================== FAILURES ==================================================================================================================================================
_____________________________________________________________________________________________________________________________________________ test_local_binary ______________________________________________________________________________________________________________________________________________ benchmark = def test_local_binary(benchmark): """parse file binary""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_binary, file) tika/tests/test_benchmark.py:32: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:112: in tika_from_binary return tika.parser.from_file(file_obj, headers=headers) tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:07:51,065 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar to /tmp/tika-server.jar. 2025-10-13 05:07:52,211 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar.md5 to /tmp/tika-server.jar.md5. 2025-10-13 05:07:52,552 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:07:57,552 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:02,553 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:07,553 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:08:07,553 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- INFO tika.tika:tika.py:802 Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar to /tmp/tika-server.jar. INFO tika.tika:tika.py:802 Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar.md5 to /tmp/tika-server.jar.md5. WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... 
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. _____________________________________________________________________________________________________________________________________________ test_parser_buffer _____________________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer(benchmark): """example how to send gzip file""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer, file) tika/tests/test_benchmark.py:40: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:107: in tika_from_buffer return tika.parser.from_buffer(file_obj.read(), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in 
callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:08:07,685 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:12,686 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:17,686 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:22,686 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:08:22,686 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
_______________________________________________________________________________________________________________________________________ test_parser_buffer_zlib_input ________________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer_zlib_input(benchmark): """example how to send gzip file""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer_zlib, file) tika/tests/test_benchmark.py:49: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:97: in tika_from_buffer_zlib return tika.parser.from_buffer(zlib.compress(file_obj.read()), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:08:22,806 [MainThread ] [WARNI] Failed to see startup log message; retrying... 
2025-10-13 05:08:27,806 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:32,806 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:08:37,806 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:08:37,807 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
_______________________________________________________________________________________________________________________________________ test_parser_buffer_gzip_input ________________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer_gzip_input(benchmark): """parse file binary""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer_gzip, file) tika/tests/test_benchmark.py:57: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:102: in tika_from_buffer_gzip return tika.parser.from_buffer(gzip_compress(file_obj.read()), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:08:37,934 [MainThread ] [WARNI] Failed to see startup log message; retrying... 
2025-10-13 05:08:42,935 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:08:47,935 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:08:52,935 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:08:52,935 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
_____________________________________________________________________________________________________________________________________ test_local_binary_with_gzip_output _____________________________________________________________________________________________________________________________________ benchmark = def test_local_binary_with_gzip_output(benchmark): """parse file binary""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_binary, file, headers={'Accept-Encoding': 'gzip, deflate'}) tika/tests/test_benchmark.py:65: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:112: in tika_from_binary return tika.parser.from_file(file_obj, headers=headers) tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError
-------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------------------------
2025-10-13 05:08:53,053 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:08:58,054 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:03,054 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:08,054 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:09:08,054 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
____________________________________________________________________________________________________________________________________ test_parser_buffer_with_gzip_output _____________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer_with_gzip_output(benchmark): """example how to send gzip file""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer, file, headers={'Accept-Encoding': 'gzip, deflate'}) tika/tests/test_benchmark.py:73: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:107: in tika_from_buffer return tika.parser.from_buffer(file_obj.read(), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError
-------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------------------------
2025-10-13 05:09:08,174 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:13,174 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:18,175 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:23,175 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:09:23,175 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
_______________________________________________________________________________________________________________________________ test_parser_buffer_zlib_input_and_gzip_output ________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer_zlib_input_and_gzip_output(benchmark): """example how to send gzip file""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer_zlib, file, headers={'Accept-Encoding': 'gzip, deflate'}) tika/tests/test_benchmark.py:82: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:97: in tika_from_buffer_zlib return tika.parser.from_buffer(zlib.compress(file_obj.read()), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError
-------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------------------------
2025-10-13 05:09:23,295 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:28,296 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:33,296 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:38,297 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:09:38,297 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
_______________________________________________________________________________________________________________________________ test_parser_buffer_gzip_input_and_gzip_output ________________________________________________________________________________________________________________________________ benchmark = def test_parser_buffer_gzip_input_and_gzip_output(benchmark): """parse file binary""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > response = benchmark(tika_from_buffer_gzip, file, headers={'Accept-Encoding': 'gzip, deflate'}) tika/tests/test_benchmark.py:90: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:125: in __call__ return self._raw(function_to_benchmark, *args, **kwargs) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:147: in _raw duration, iterations, loops_range = self._calibrate_timer(runner) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:275: in _calibrate_timer duration = runner(loops_range) /usr/lib/python3.12/site-packages/pytest_benchmark/fixture.py:90: in runner function_to_benchmark(*args, **kwargs) tika/tests/test_benchmark.py:102: in tika_from_buffer_gzip return tika.parser.from_buffer(gzip_compress(file_obj.read()), headers=headers) tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError
-------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------------------------
2025-10-13 05:09:38,416 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:43,416 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:48,417 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:09:53,417 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:09:53,417 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
______________________________________________________________________________________________________________________________________ CreateTest.test_default_service _______________________________________________________________________________________________________________________________________ self = def test_default_service(self): 'parse file using default service' > result = tika.parser.from_file( 'https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf') tika/tests/test_from_file_service.py:34: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. 
http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:09:53,441 [MainThread ] [INFO ] Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. 2025-10-13 05:09:53,939 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:09:58,939 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:03,939 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:08,939 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:10:08,940 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. 
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- INFO tika.tika:tika.py:776 Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. __________________________________________________________________________________________________________________________________ CreateTest.test_default_service_explicit __________________________________________________________________________________________________________________________________ self = def test_default_service_explicit(self): 'parse file using default service explicitly' > result = tika.parser.from_file( 'https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf', service='all') tika/tests/test_from_file_service.py:49: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer 
serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError
-------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------------------------
2025-10-13 05:10:08,958 [MainThread ] [INFO ] Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf.
2025-10-13 05:10:09,465 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:10:14,465 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:10:19,465 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:10:24,465 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:10:24,466 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------------
INFO tika.tika:tika.py:776 Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf.
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
______________________________________________________________________________________________________________________________________ CreateTest.test_invalid_service _______________________________________________________________________________________________________________________________________ self = def test_invalid_service(self): 'parse file using an invalid service should perform the default parsing' > result = tika.parser.from_file( 'https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf', service='bad') tika/tests/test_from_file_service.py:67: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. 
:param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:10:24,484 [MainThread ] [INFO ] Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. 2025-10-13 05:10:24,889 [MainThread ] [WARNI] config option must be one of meta, text, or all; using all. 2025-10-13 05:10:24,982 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:29,983 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:34,983 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:39,983 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 
2025-10-13 05:10:39,983 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- INFO tika.tika:tika.py:776 Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. WARNING tika.tika:tika.py:332 config option must be one of meta, text, or all; using all. WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
________________________________________________________________________________________________________________________________________ CreateTest.test_meta_service ________________________________________________________________________________________________________________________________________ self = def test_meta_service(self): 'parse file using the content only service' > result = tika.parser.from_file( 'https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf', service='meta') tika/tests/test_from_file_service.py:61: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. 
http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:10:40,001 [MainThread ] [INFO ] Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. 2025-10-13 05:10:40,500 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:45,501 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:50,501 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:10:55,501 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:10:55,501 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. 
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- INFO tika.tika:tika.py:776 Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. ________________________________________________________________________________________________________________________________________ CreateTest.test_text_service ________________________________________________________________________________________________________________________________________ self = def test_text_service(self): 'parse file using the content only service' > result = tika.parser.from_file( 'https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf', service='text') tika/tests/test_from_file_service.py:55: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = 
checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:10:55,520 [MainThread ] [INFO ] Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. 2025-10-13 05:10:56,024 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:01,024 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:06,025 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:11,025 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:11:11,025 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- INFO tika.tika:tika.py:776 Retrieving https://boe.es/boe/dias/2019/12/02/pdfs/BOE-A-2019-17288.pdf to /tmp/boe-dias-2019-12-02-pdfs-boe-a-2019-17288.pdf. WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
________________________________________________________________________________________________________________________________________ CreateTest.test_kill_server _________________________________________________________________________________________________________________________________________ self = def test_kill_server(self): """parse some file then kills server""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') with open(file, 'rb') as file_obj: > tika.parser.from_file(file_obj) tika/tests/test_tika.py:68: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. 
http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:11:11,136 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:16,136 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:21,137 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:26,137 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:11:26,137 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. 
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. ________________________________________________________________________________________________________________________________________ CreateTest.test_local_binary ________________________________________________________________________________________________________________________________________ self = def test_local_binary(self): """parse file binary""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') with open(file, 'rb') as file_obj: > self.assertTrue(tika.parser.from_file(file_obj)) tika/tests/test_tika.py:53: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:11:26,248 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:31,249 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:36,249 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:41,249 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:11:41,249 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
________________________________________________________________________________________________________________________________________ CreateTest.test_local_buffer ________________________________________________________________________________________________________________________________________ self = def test_local_buffer(self): > response = tika.parser.from_buffer('Good evening, Dave') tika/tests/test_tika.py:56: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:65: in from_buffer status, response = callServer('put', serverEndpoint, '/rmeta/text', string, headers, False, config_path=config_path, requestOptions=requestOptions) tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. 
http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:11:41,360 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:46,360 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:51,361 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:11:56,361 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:11:56,361 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. 
--------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. _________________________________________________________________________________________________________________________________________ CreateTest.test_local_path _________________________________________________________________________________________________________________________________________ self = def test_local_path(self): """parse file path""" file = os.path.join(os.path.dirname(__file__), 'files', 'rwservlet.pdf') > self.assertTrue(tika.parser.from_file(file)) tika/tests/test_tika.py:62: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ tika/parser.py:40: in from_file output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions) tika/tika.py:337: in parse1 status, response = callServer('put', serverEndpoint, service, f, tika/tika.py:532: in callServer serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None): ''' Check that tika-server is running. If not, download JAR file and start it up. :param scheme: e.g. http or https :param serverHost: :param port: :param tikaServerJar: :param classpath: :return: ''' if classpath is None: classpath = TikaServerClasspath if port is None: port = '443' if scheme == 'https' else '80' urlp = urlparse(tikaServerJar) serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port) jarPath = os.path.join(TikaJarPath, 'tika-server.jar') if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint: alreadyRunning = checkPortIsOpen(serverHost, port) if not alreadyRunning: if not os.path.isfile(jarPath) and urlp.scheme != '': getRemoteJar(tikaServerJar, jarPath) if not checkJarSig(tikaServerJar, jarPath): os.remove(jarPath) tikaServerJar = getRemoteJar(tikaServerJar, jarPath) status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path) if not status: log.error("Failed to receive startup confirmation from startServer.") > raise RuntimeError("Unable to start Tika server.") E RuntimeError: Unable to start Tika server. 
tika/tika.py:602: RuntimeError -------------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------------------------------- 2025-10-13 05:11:56,469 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:12:01,470 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:12:06,470 [MainThread ] [WARNI] Failed to see startup log message; retrying... 2025-10-13 05:12:11,470 [MainThread ] [ERROR] Tika startup log message not received after 3 tries. 2025-10-13 05:12:11,470 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer. --------------------------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------------------------- WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying... ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries. ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer. 
__________________________ CreateTest.test_remote_html ___________________________

self =

    def test_remote_html(self):
        """parse remote HTML"""
>       self.assertTrue(tika.parser.from_file('http://neverssl.com/index.html'))

tika/tests/test_tika.py:37:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tika/parser.py:40: in from_file
    output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions)
tika/tika.py:337: in parse1
    status, response = callServer('put', serverEndpoint, service, f,
tika/tika.py:532: in callServer
    serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None

    def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None):
        '''
        Check that tika-server is running. If not, download JAR file and start it up.
        :param scheme: e.g. http or https
        :param serverHost:
        :param port:
        :param tikaServerJar:
        :param classpath:
        :return:
        '''
        if classpath is None:
            classpath = TikaServerClasspath
        if port is None:
            port = '443' if scheme == 'https' else '80'
        urlp = urlparse(tikaServerJar)
        serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port)
        jarPath = os.path.join(TikaJarPath, 'tika-server.jar')
        if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint:
            alreadyRunning = checkPortIsOpen(serverHost, port)
            if not alreadyRunning:
                if not os.path.isfile(jarPath) and urlp.scheme != '':
                    getRemoteJar(tikaServerJar, jarPath)
                if not checkJarSig(tikaServerJar, jarPath):
                    os.remove(jarPath)
                    tikaServerJar = getRemoteJar(tikaServerJar, jarPath)
                status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path)
                if not status:
                    log.error("Failed to receive startup confirmation from startServer.")
>                   raise RuntimeError("Unable to start Tika server.")
E                   RuntimeError: Unable to start Tika server.

tika/tika.py:602: RuntimeError
------------------------------ Captured stderr call ------------------------------
2025-10-13 05:12:11,488 [MainThread ] [INFO ] Retrieving http://neverssl.com/index.html to /tmp/index.html.
2025-10-13 05:15:02,553 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:07,553 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:12,553 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:17,554 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:15:17,554 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
------------------------------- Captured log call --------------------------------
INFO tika.tika:tika.py:776 Retrieving http://neverssl.com/index.html to /tmp/index.html.
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
___________________________ CreateTest.test_remote_mp3 ___________________________

self =

    def test_remote_mp3(self):
        """parse remote mp3"""
>       self.assertTrue(tika.parser.from_file(
            'https://archive.org/download/Ainst-Spaceshipdemo.mp3/Ainst-Spaceshipdemo.mp3'))

tika/tests/test_tika.py:41:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tika/parser.py:40: in from_file
    output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions)
tika/tika.py:337: in parse1
    status, response = callServer('put', serverEndpoint, service, f,
tika/tika.py:532: in callServer
    serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None

    def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None):
        '''
        Check that tika-server is running. If not, download JAR file and start it up.
        :param scheme: e.g. http or https
        :param serverHost:
        :param port:
        :param tikaServerJar:
        :param classpath:
        :return:
        '''
        if classpath is None:
            classpath = TikaServerClasspath
        if port is None:
            port = '443' if scheme == 'https' else '80'
        urlp = urlparse(tikaServerJar)
        serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port)
        jarPath = os.path.join(TikaJarPath, 'tika-server.jar')
        if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint:
            alreadyRunning = checkPortIsOpen(serverHost, port)
            if not alreadyRunning:
                if not os.path.isfile(jarPath) and urlp.scheme != '':
                    getRemoteJar(tikaServerJar, jarPath)
                if not checkJarSig(tikaServerJar, jarPath):
                    os.remove(jarPath)
                    tikaServerJar = getRemoteJar(tikaServerJar, jarPath)
                status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path)
                if not status:
                    log.error("Failed to receive startup confirmation from startServer.")
>                   raise RuntimeError("Unable to start Tika server.")
E                   RuntimeError: Unable to start Tika server.

tika/tika.py:602: RuntimeError
------------------------------ Captured stderr call ------------------------------
2025-10-13 05:15:17,572 [MainThread ] [INFO ] Retrieving https://archive.org/download/Ainst-Spaceshipdemo.mp3/Ainst-Spaceshipdemo.mp3 to /tmp/download-ainst-spaceshipdemo.mp3-ainst-spaceshipdemo.mp3.
2025-10-13 05:15:21,533 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:26,533 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:31,534 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:36,534 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:15:36,534 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
------------------------------- Captured log call --------------------------------
INFO tika.tika:tika.py:776 Retrieving https://archive.org/download/Ainst-Spaceshipdemo.mp3/Ainst-Spaceshipdemo.mp3 to /tmp/download-ainst-spaceshipdemo.mp3-ainst-spaceshipdemo.mp3.
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
___________________________ CreateTest.test_remote_pdf ___________________________

self =

    def test_remote_pdf(self):
        """parse remote PDF"""
>       self.assertTrue(tika.parser.from_file(
            'http://appsrv.achd.net/reports/rwservlet?food_rep_insp&P_ENCOUNTER=201504160015'))

tika/tests/test_tika.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tika/parser.py:40: in from_file
    output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions)
tika/tika.py:337: in parse1
    status, response = callServer('put', serverEndpoint, service, f,
tika/tika.py:532: in callServer
    serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

scheme = 'http', serverHost = 'localhost', port = 9998, tikaServerJar = 'http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server-standard/3.1.0/tika-server-standard-3.1.0.jar', classpath = '', config_path = None

    def checkTikaServer(scheme="http", serverHost=ServerHost, port=Port, tikaServerJar=TikaServerJar, classpath=None, config_path=None):
        '''
        Check that tika-server is running. If not, download JAR file and start it up.
        :param scheme: e.g. http or https
        :param serverHost:
        :param port:
        :param tikaServerJar:
        :param classpath:
        :return:
        '''
        if classpath is None:
            classpath = TikaServerClasspath
        if port is None:
            port = '443' if scheme == 'https' else '80'
        urlp = urlparse(tikaServerJar)
        serverEndpoint = '%s://%s:%s' % (scheme, serverHost, port)
        jarPath = os.path.join(TikaJarPath, 'tika-server.jar')
        if 'localhost' in serverEndpoint or '127.0.0.1' in serverEndpoint:
            alreadyRunning = checkPortIsOpen(serverHost, port)
            if not alreadyRunning:
                if not os.path.isfile(jarPath) and urlp.scheme != '':
                    getRemoteJar(tikaServerJar, jarPath)
                if not checkJarSig(tikaServerJar, jarPath):
                    os.remove(jarPath)
                    tikaServerJar = getRemoteJar(tikaServerJar, jarPath)
                status = startServer(jarPath, TikaJava, TikaJavaArgs, serverHost, port, classpath, config_path)
                if not status:
                    log.error("Failed to receive startup confirmation from startServer.")
>                   raise RuntimeError("Unable to start Tika server.")
E                   RuntimeError: Unable to start Tika server.

tika/tika.py:602: RuntimeError
------------------------------ Captured stderr call ------------------------------
2025-10-13 05:15:36,552 [MainThread ] [INFO ] Retrieving http://appsrv.achd.net/reports/rwservlet?food_rep_insp&P_ENCOUNTER=201504160015 to /tmp/reports-rwservlet.
2025-10-13 05:15:42,273 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:47,274 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:52,274 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2025-10-13 05:15:57,274 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2025-10-13 05:15:57,274 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
------------------------------- Captured log call --------------------------------
INFO tika.tika:tika.py:776 Retrieving http://appsrv.achd.net/reports/rwservlet?food_rep_insp&P_ENCOUNTER=201504160015 to /tmp/reports-rwservlet.
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
WARNING tika.tika:tika.py:695 Failed to see startup log message; retrying...
ERROR tika.tika:tika.py:700 Tika startup log message not received after 3 tries.
ERROR tika.tika:tika.py:601 Failed to receive startup confirmation from startServer.
================================ warnings summary ================================
tika/__init__.py:20
  /home/udu/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
    __import__('pkg_resources').declare_namespace(__name__)

tika/__init__.py:20
  /home/udu/aports/community/py3-tika/src/tika-python-3.1.0/tika/__init__.py:20: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    __import__('pkg_resources').declare_namespace(__name__)

tika/tests/__init__.py:18
  /home/udu/aports/community/py3-tika/src/tika-python-3.1.0/tika/tests/__init__.py:18: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika.tests')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    __import__('pkg_resources').declare_namespace(__name__)

../../../../../../../usr/lib/python3.12/site-packages/pkg_resources/__init__.py:2558
  /usr/lib/python3.12/site-packages/pkg_resources/__init__.py:2558: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('tika')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(parent)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================ short test summary info =============================
FAILED tika/tests/test_benchmark.py::test_local_binary - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer_zlib_input - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer_gzip_input - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_local_binary_with_gzip_output - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer_with_gzip_output - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer_zlib_input_and_gzip_output - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_benchmark.py::test_parser_buffer_gzip_input_and_gzip_output - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_from_file_service.py::CreateTest::test_default_service - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_from_file_service.py::CreateTest::test_default_service_explicit - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_from_file_service.py::CreateTest::test_invalid_service - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_from_file_service.py::CreateTest::test_meta_service - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_from_file_service.py::CreateTest::test_text_service - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_kill_server - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_local_binary - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_local_buffer - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_local_path - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_remote_html - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_remote_mp3 - RuntimeError: Unable to start Tika server.
FAILED tika/tests/test_tika.py::CreateTest::test_remote_pdf - RuntimeError: Unable to start Tika server.
======= 20 failed, 1 passed, 2 deselected, 4 warnings in 487.34s (0:08:07) =======