2899: RFC: Provide a special purpose FromPyObject impl for byte slices r=davidhewitt a=adamreichold

This enables efficiently and safely getting a byte slice from either bytes or byte arrays.

The main issue I see here is discoverability, i.e. should this be mention in the docs of `PyBytes` and `PyByteArray` or in the guide?

It is also not completely clear whether this really _fixes_ the issue.

Closes #2888 

Co-authored-by: Adam Reichold <adam.reichold@t-online.de>
This commit is contained in:
bors[bot] 2023-02-22 22:13:56 +00:00 committed by GitHub
commit bbaceb88b4
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 61 additions and 6 deletions

View File

@ -14,7 +14,7 @@ The table below contains the Python type and the corresponding function argument
| ------------- |:-------------------------------:|:--------------------:|
| `object` | - | `&PyAny` |
| `str` | `String`, `Cow<str>`, `&str`, `OsString`, `PathBuf` | `&PyUnicode` |
| `bytes` | `Vec<u8>`, `&[u8]` | `&PyBytes` |
| `bytes` | `Vec<u8>`, `&[u8]`, `Cow<[u8]>` | `&PyBytes` |
| `bool` | `bool` | `&PyBool` |
| `int` | Any integer type (`i32`, `u32`, `usize`, etc) | `&PyLong` |
| `float` | `f32`, `f64` | `&PyFloat` |
@ -24,7 +24,7 @@ The table below contains the Python type and the corresponding function argument
| `tuple[T, U]` | `(T, U)`, `Vec<T>` | `&PyTuple` |
| `set[T]` | `HashSet<T>`, `BTreeSet<T>`, `hashbrown::HashSet<T>`[^2] | `&PySet` |
| `frozenset[T]` | `HashSet<T>`, `BTreeSet<T>`, `hashbrown::HashSet<T>`[^2] | `&PyFrozenSet` |
| `bytearray` | `Vec<u8>` | `&PyByteArray` |
| `bytearray` | `Vec<u8>`, `Cow<[u8]>` | `&PyByteArray` |
| `slice` | - | `&PySlice` |
| `type` | - | `&PyType` |
| `module` | - | `&PyModule` |
@ -84,6 +84,7 @@ Finally, the following Rust types are also able to convert to Python as return v
| `Option<T>` | `Optional[T]` |
| `(T, U)` | `Tuple[T, U]` |
| `Vec<T>` | `List[T]` |
| `Cow<[u8]>` | `bytes` |
| `HashMap<K, V>` | `Dict[K, V]` |
| `BTreeMap<K, V>` | `Dict[K, V]` |
| `HashSet<T>` | `Set[T]` |

View File

@ -0,0 +1 @@
Add special-purpose impl of `FromPyObject` for `Cow<[u8]>` to efficiently handle both `bytes` and `bytearray` objects

View File

@ -1,9 +1,12 @@
use crate::{ffi, AsPyPointer, Py, PyAny, PyResult, Python};
use crate::{ffi, AsPyPointer, FromPyObject, IntoPy, Py, PyAny, PyResult, Python, ToPyObject};
use std::borrow::Cow;
use std::ops::Index;
use std::os::raw::c_char;
use std::slice::SliceIndex;
use std::str;
use super::bytearray::PyByteArray;
/// Represents a Python `bytes` object.
///
/// This type is immutable.
@ -121,11 +124,37 @@ impl<I: SliceIndex<[u8]>> Index<I> for PyBytes {
}
}
/// Special-purpose trait impl to efficiently handle both `bytes` and `bytearray`
///
/// If the source object is a `bytes` object, the `Cow` will be borrowed and
/// pointing into the source object, and no copying or heap allocations will happen.
/// If it is a `bytearray`, its contents will be copied to an owned `Cow`.
impl<'source> FromPyObject<'source> for Cow<'source, [u8]> {
fn extract(ob: &'source PyAny) -> PyResult<Self> {
if let Ok(bytes) = ob.downcast::<PyBytes>() {
return Ok(Cow::Borrowed(bytes.as_bytes()));
}
let byte_array = ob.downcast::<PyByteArray>()?;
Ok(Cow::Owned(byte_array.to_vec()))
}
}
impl ToPyObject for Cow<'_, [u8]> {
fn to_object(&self, py: Python<'_>) -> Py<PyAny> {
PyBytes::new(py, self.as_ref()).into()
}
}
impl IntoPy<Py<PyAny>> for Cow<'_, [u8]> {
fn into_py(self, py: Python<'_>) -> Py<PyAny> {
self.to_object(py)
}
}
#[cfg(test)]
mod tests {
use super::PyBytes;
use crate::FromPyObject;
use crate::Python;
use super::*;
#[test]
fn test_bytes_index() {
@ -172,4 +201,28 @@ mod tests {
.is_instance_of::<PyValueError>(py));
});
}
#[test]
fn test_cow_impl() {
Python::with_gil(|py| {
let bytes = py.eval(r#"b"foobar""#, None, None).unwrap();
let cow = bytes.extract::<Cow<'_, [u8]>>().unwrap();
assert_eq!(cow, Cow::<[u8]>::Borrowed(b"foobar"));
let byte_array = py.eval(r#"bytearray(b"foobar")"#, None, None).unwrap();
let cow = byte_array.extract::<Cow<'_, [u8]>>().unwrap();
assert_eq!(cow, Cow::<[u8]>::Owned(b"foobar".to_vec()));
let something_else_entirely = py.eval("42", None, None).unwrap();
something_else_entirely
.extract::<Cow<'_, [u8]>>()
.unwrap_err();
let cow = Cow::<[u8]>::Borrowed(b"foobar").to_object(py);
assert!(cow.as_ref(py).is_instance_of::<PyBytes>().unwrap());
let cow = Cow::<[u8]>::Owned(b"foobar".to_vec()).to_object(py);
assert!(cow.as_ref(py).is_instance_of::<PyBytes>().unwrap());
});
}
}