Descriptor Reader
Utilities for reading descriptors from local directories and archives. This is
mostly done through the DescriptorReader class, which is an iterator for the
descriptor data in a series of destinations. For example…
from stem.descriptor.reader import DescriptorReader

my_descriptors = [
    '/tmp/server-descriptors-2012-03.tar.bz2',
    '/tmp/archived_descriptors/',
]

# prints the contents of all the descriptor files
with DescriptorReader(my_descriptors) as reader:
    for descriptor in reader:
        print(descriptor)
This ignores files that cannot be processed due to read errors or unparsable
content. To be notified of skipped files you can register a listener with
register_skip_listener().
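A minimal sketch of registering such a listener follows. It assumes the listener is called with the path of the skipped file and the exception describing why it was skipped. The names record_skip, skipped_files and print_descriptors are hypothetical helpers made up for this illustration.

```python
import collections

# Paths of skipped files, grouped by the class name of the reason.
skipped_files = collections.defaultdict(list)


def record_skip(path, exception):
    # Assumed listener signature: the skipped file's path, then the
    # exception that explains why it was skipped.
    skipped_files[type(exception).__name__].append(path)


def print_descriptors(targets):
    # Imported here so record_skip can be exercised without Stem installed.
    from stem.descriptor.reader import DescriptorReader

    reader = DescriptorReader(targets)
    reader.register_skip_listener(record_skip)

    with reader:
        for descriptor in reader:
            print(descriptor)

    # Summarize what was skipped once reading has finished.
    for reason, paths in sorted(skipped_files.items()):
        print('%s: %i file(s) skipped' % (reason, len(paths)))
```

Grouping by exception class keeps the summary short when a directory contains many files of the same unsupported kind.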
The DescriptorReader keeps track of the last modified timestamps for
descriptor files that it has read so it can skip unchanged files if run again.
This listing of processed files can also be persisted and applied to other
DescriptorReader instances. For example, the following prints descriptors as
they’re changed over the course of a minute, and picks up where it left off
if run again…
import time

from stem.descriptor.reader import (
    DescriptorReader,
    load_processed_files,
    save_processed_files,
)

reader = DescriptorReader(['/tmp/descriptor_data'])

try:
    processed_files = load_processed_files('/tmp/used_descriptors')
    reader.set_processed_files(processed_files)
except Exception:
    pass  # could not load, maybe this is the first run

start_time = time.time()

while (time.time() - start_time) < 60:
    # prints any descriptors that have changed since last checked
    with reader:
        for descriptor in reader:
            print(descriptor)

    time.sleep(1)

save_processed_files('/tmp/used_descriptors', reader.get_processed_files())
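The idea of skipping unchanged files by their last modified timestamp can be sketched in plain Python. This is only an illustration of the technique, not Stem's implementation, and changed_files is a hypothetical helper. It assumes the listing of processed files is a dictionary mapping each path to the last modified timestamp that was seen for it.

```python
import os


def changed_files(paths, processed_files):
    """
    Provides the paths that were modified since they were last recorded in
    processed_files, and records their current timestamps in that dictionary.
    """

    changed = []

    for path in paths:
        last_modified = int(os.stat(path).st_mtime)
        last_used = processed_files.get(path)

        if last_used is not None and last_used >= last_modified:
            continue  # unchanged since we last read it

        processed_files[path] = last_modified
        changed.append(path)

    return changed
```

Calling changed_files twice with the same dictionary returns an empty list the second time unless a file was modified in between, which mirrors the "picks up where it left off" behavior described above.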
Module Overview:

load_processed_files - Loads a listing of processed files
save_processed_files - Saves a listing of processed files

DescriptorReader - Iterator for descriptor data on the local file system
  |- get_processed_files - provides the listing of files that we've processed
  |- set_processed_files - sets our tracking of the files we have processed
  |- register_read_listener - adds a listener for when files are read
  |- register_skip_listener - adds a listener that's notified of skipped files
  |- start - begins reading descriptor data
  |- stop - stops reading descriptor data
  |- __enter__ / __exit__ - manages the descriptor reader thread in the context
  +- __iter__ - iterates over descriptor data in unread files

FileSkipped - Base exception for a file that was skipped
  |- AlreadyRead - We've already read a file with this last modified timestamp
  |- ParsingFailure - Contents can't be parsed as descriptor data
  |- UnrecognizedType - File extension indicates non-descriptor data
  +- ReadFailed - Wraps an error that was raised while reading the file
    +- FileMissing - File does not exist
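As a rough sketch of how the methods listed above fit together, the reader can be driven with start and stop directly instead of a with block, alongside a read listener. This assumes the read listener is called with the path of each file as it is read. The names record_read, read_paths and read_with_explicit_lifecycle are hypothetical and exist only for this example.

```python
# Paths reported to the read listener, in the order they were reported.
read_paths = []


def record_read(path):
    # Assumed listener signature: the path of the file being read.
    read_paths.append(path)


def read_with_explicit_lifecycle(targets):
    # Imported here so record_read can be exercised without Stem installed.
    from stem.descriptor.reader import DescriptorReader

    reader = DescriptorReader(targets)
    reader.register_read_listener(record_read)

    reader.start()  # begins reading descriptor data

    try:
        for descriptor in reader:
            print(descriptor)
    finally:
        reader.stop()  # stops reading, even if printing raised an error

    return list(read_paths)
```

The try/finally block plays the role that the with statement plays in the earlier examples: the reader is stopped whether or not iteration completes.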
Deprecated since version 1.8.0: This module will likely be removed in Stem 2.0 due to lack of usage. If you use this module please let me know.