For DataStreams Sources and Sinks, do all the methods listed in the documentation (linked below) need to be implemented? For instance, for a source that is sequential, reset! is often meaningless.
All the interface methods should list whether they’re required or not; definitely open an issue if something’s not clear. As for Data.reset!, it certainly can be useful for sequential sources. Take CSV.Source, for example: it holds an internal IO object that represents the underlying CSV file. It also records datapos, the byte offset in the file where the actual table data starts, so Data.reset! is defined simply as a seek on that IO back to datapos.
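To make the seek-based reset concrete, here is a minimal sketch. ToySource and this reset! are hypothetical stand-ins written for illustration; they are not the actual CSV.Source or Data.reset! definitions, and treating the first line as the header is an assumption.

```julia
# Hypothetical stand-in for a CSV-like source; NOT the real CSV.Source.
struct ToySource{T<:IO}
    io::T          # underlying stream
    datapos::Int   # byte offset where the table data begins (after the header)
end

# Build a source from an IO, assuming the first line is the header.
function ToySource(io::IO)
    readline(io)                          # consume the header line
    return ToySource(io, Int(position(io)))
end

# Hypothetical reset: a single seek back to the recorded offset.
reset!(s::ToySource) = (seek(s.io, s.datapos); nothing)

src = ToySource(IOBuffer("a,b\n1,2\n3,4\n"))
first_pass = readlines(src.io)    # reads the data rows to the end
reset!(src)
second_pass = readlines(src.io)   # same rows again after the reset
```

This only works when the underlying IO is seekable, which holds for files and in-memory buffers but not for something like a live network socket.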
Another case is SQLite.Source: the data must be accessed row by row, so it’s not RandomAccess, but it can still support Data.reset! because that just involves resetting the result cursor back to row 1 of the result set.
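The same idea for a cursor-style source might look roughly like the sketch below. CursorSource, nextrow!, isdone and reset! are all hypothetical names used for illustration; none of this is the SQLite.jl API, and the in-memory vector only imitates a result set.

```julia
# Hypothetical row-by-row source; a stand-in for a database result cursor.
mutable struct CursorSource
    rows::Vector{Vector{Int}}  # pretend result set
    current::Int               # index of the next row to hand out
end

CursorSource(rows) = CursorSource(rows, 1)

isdone(c::CursorSource) = c.current > length(c.rows)

# Rows can only be fetched in order, one at a time (no random access).
function nextrow!(c::CursorSource)
    row = c.rows[c.current]
    c.current += 1
    return row
end

# Hypothetical reset: rewind the cursor to the first row.
reset!(c::CursorSource) = (c.current = 1; nothing)

cur = CursorSource([[1, 2], [3, 4]])
nextrow!(cur); nextrow!(cur)
exhausted = isdone(cur)   # cursor has run past the last row
reset!(cur)
```

So sequential access and resettability are independent properties: a source can forbid jumping to an arbitrary row and still allow starting over.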
Thanks. I’ll open an issue on GitHub. I didn’t mean to imply that reset! is never useful (though perhaps my words “often meaningless” were a bit strong). My point is more that it is sometimes not useful. I have in mind a true “streaming” data set, such as trade and quote data being broadcast over the network; there is usually no way to seek on this type of data. Anyway, further discussion can take place on GitHub.
I have another question that I wouldn’t call an “issue”. The documentation says: “Packages can have a single julia type implement both the Data.Source and Data.Sink interfaces, or two separate types can implement them separately.” My understanding is that such types should inherit from the abstract types, but Julia doesn’t have multiple inheritance, so I don’t understand how one type can inherit from both.
The example I have in mind is a type that takes a stream and filters it to return a lower bandwidth stream.