Move all text functionality toplevel
parent
de8d82ab63
commit
844769ae19
10 changed files with 96 additions and 118 deletions
|
|
@ -1,4 +1,4 @@
|
||||||
use similar::text::get_close_matches;
|
use similar::get_close_matches;
|
||||||
|
|
||||||
fn main() {
|
fn main() {
|
||||||
let words = vec![
|
let words = vec![
|
||||||
|
|
|
||||||
|
|
@ -3,8 +3,7 @@ use std::fs::read;
|
||||||
use std::process::exit;
|
use std::process::exit;
|
||||||
|
|
||||||
use console::{style, Style};
|
use console::{style, Style};
|
||||||
use similar::text::TextDiff;
|
use similar::{ChangeTag, TextDiff};
|
||||||
use similar::ChangeTag;
|
|
||||||
|
|
||||||
struct Line(Option<usize>);
|
struct Line(Option<usize>);
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,5 @@
|
||||||
use console::Style;
|
use console::Style;
|
||||||
use similar::text::TextDiff;
|
use similar::{ChangeTag, TextDiff};
|
||||||
use similar::ChangeTag;
|
|
||||||
|
|
||||||
fn main() {
|
fn main() {
|
||||||
let diff = TextDiff::from_lines(
|
let diff = TextDiff::from_lines(
|
||||||
|
|
|
||||||
|
|
@ -2,7 +2,7 @@ use std::fs::read;
|
||||||
use std::io;
|
use std::io;
|
||||||
use std::process::exit;
|
use std::process::exit;
|
||||||
|
|
||||||
use similar::text::TextDiff;
|
use similar::TextDiff;
|
||||||
|
|
||||||
fn main() {
|
fn main() {
|
||||||
let args: Vec<_> = std::env::args_os().collect();
|
let args: Vec<_> = std::env::args_os().collect();
|
||||||
|
|
|
||||||
91
src/lib.rs
|
|
@ -4,8 +4,7 @@
|
||||||
//!
|
//!
|
||||||
//! ```rust
|
//! ```rust
|
||||||
//! # #[cfg(feature = "text")] {
|
//! # #[cfg(feature = "text")] {
|
||||||
//! use similar::ChangeTag;
|
//! use similar::{ChangeTag, TextDiff};
|
||||||
//! use similar::text::TextDiff;
|
|
||||||
//!
|
//!
|
||||||
//! let diff = TextDiff::from_lines(
|
//! let diff = TextDiff::from_lines(
|
||||||
//! "Hello World\nThis is the second line.\nThis is the third.",
|
//! "Hello World\nThis is the second line.\nThis is the third.",
|
||||||
|
|
@ -25,38 +24,108 @@
|
||||||
//! # }
|
//! # }
|
||||||
//! ```
|
//! ```
|
||||||
//!
|
//!
|
||||||
//! ## Functionality
|
//! # API
|
||||||
|
//!
|
||||||
|
//! The API of the crate is split into high and low level functionality. Most
|
||||||
|
//! of what you probably want to use is available at the top level. Additionally the
|
||||||
|
//! following sub modules exist:
|
||||||
//!
|
//!
|
||||||
//! * [`algorithms`]: This implements the different types of diffing algorithms.
|
//! * [`algorithms`]: This implements the different types of diffing algorithms.
|
||||||
//! It provides both low level access to the algorithms with the minimal
|
//! It provides both low level access to the algorithms with the minimal
|
||||||
//! trait bounds necessary, as well as a generic interface.
|
//! trait bounds necessary, as well as a generic interface.
|
||||||
//! * [`text`]: This extends the general diffing functionality to text (and more
|
//! * [`udiff`]: Unified diff functionality.
|
||||||
//! specifically line) based diff operations.
|
|
||||||
//!
|
//!
|
||||||
//! ## Features
|
//! # Sequence Diffing
|
||||||
|
//!
|
||||||
|
//! If you want to diff sequences or generally indexable things you can use the
|
||||||
|
//! [`capture_diff`] and [`capture_diff_slices`] functions. They will directly
|
||||||
|
//! diff an indexable object or slice and return a vector of [`DiffOp`] objects.
|
||||||
|
//!
|
||||||
|
//! # Text Diffing
|
||||||
|
//!
|
||||||
|
//! Similar provides helpful utilities for text (and more specifically line) diff
|
||||||
|
//! operations. The main type you want to work with is [`TextDiff`] which
|
||||||
|
//! uses the underlying diff algorithms to expose a convenient API to work with
|
||||||
|
//! texts.
|
||||||
|
//!
|
||||||
|
//! ## Trailing Newlines
|
||||||
|
//!
|
||||||
|
//! When working with line diffs (and unified diffs in general) there are two
|
||||||
|
//! "philosophies" to look at lines. One is to diff lines without their newline
|
||||||
|
//! character, the other is to diff with the newline character. Typically the
|
||||||
|
//! latter is done because text files do not _have_ to end in a newline character.
|
||||||
|
//! As a result there is a difference between `foo\n` and `foo` as far as diffs
|
||||||
|
//! are concerned.
|
||||||
|
//!
|
||||||
|
//! In similar this is handled on the [`Change`] or [`InlineChange`] level. If
|
||||||
|
//! a diff was created via [`TextDiff::from_lines`] the text diffing system is
|
||||||
|
//! instructed to check if there are missing newlines encountered. If that is
|
||||||
|
//! the case the [`Change`] object will return true from the
|
||||||
|
//! [`Change::missing_newline`] method so the caller knows to handle this by
|
||||||
|
//! either rendering a virtual newline at that position or to indicate it in
|
||||||
|
//! different ways. For instance the unified diff code will render the special
|
||||||
|
//! `\ No newline at end of file` marker.
|
||||||
|
//!
|
||||||
|
//! ## Bytes vs Unicode
|
||||||
|
//!
|
||||||
|
//! Similar concerns itself with a looser definition of "text" than you would
|
||||||
|
//! normally see in Rust. While by default it can only operate on [`str`] types
|
||||||
|
//! by enabling the `bytes` feature it gains support for byte slices with some
|
||||||
|
//! caveats.
|
||||||
|
//!
|
||||||
|
//! A lot of text diff functionality assumes that what is being diffed constitutes
|
||||||
|
//! text, but in the real world it can often be challenging to ensure that this is
|
||||||
|
//! all valid utf-8. Because of this the crate is built so that most functionality
|
||||||
|
//! also still works with bytes for as long as they are roughly ASCII compatible.
|
||||||
|
//!
|
||||||
|
//! This means you will be successful in creating a unified diff from latin1
|
||||||
|
//! encoded bytes but if you try to do the same with EBCDIC encoded bytes you
|
||||||
|
//! will only get garbage.
|
||||||
|
//!
|
||||||
|
//! # Ops vs Changes
|
||||||
|
//!
|
||||||
|
//! Because very commonly two compared sequences will largely match this module
|
||||||
|
//! splits its functionality into two layers:
|
||||||
|
//!
|
||||||
|
//! Changes are encoded as [diff operations](crate::DiffOp). These are
|
||||||
|
//! ranges of the differences by index in the source sequence. Because this
|
||||||
|
//! can be cumbersome to work with a separate method [`DiffOp::iter_changes`]
|
||||||
|
//! (and [`TextDiff::iter_changes`] when working with text diffs) is provided
|
||||||
|
//! which expands all the changes on an item by item level encoded in an operation.
|
||||||
|
//!
|
||||||
|
//! As the [`TextDiff::grouped_ops`] method can isolate clusters of changes
|
||||||
|
//! this even works for very long files if paired with this method.
|
||||||
|
//!
|
||||||
|
//! # Feature Flags
|
||||||
//!
|
//!
|
||||||
//! The crate by default does not have any dependencies however for some use
|
//! The crate by default does not have any dependencies however for some use
|
||||||
//! cases it's useful to pull in extra functionality. Likewise you can turn
|
//! cases it's useful to pull in extra functionality. Likewise you can turn
|
||||||
//! off some functionality.
|
//! off some functionality.
|
||||||
//!
|
//!
|
||||||
//! * `text`: this feature is enabled by default and enables the [`text`] module.
|
//! * `text`: this feature is enabled by default and enables the text based
|
||||||
|
//! diffing types such as [`TextDiff`].
|
||||||
//! If the crate is used without default features it's removed.
|
//! If the crate is used without default features it's removed.
|
||||||
//! * `unicode`: when this feature is enabled the text diffing functionality
|
//! * `unicode`: when this feature is enabled the text diffing functionality
|
||||||
//! gains the ability to diff on a grapheme instead of character level. This
|
//! gains the ability to diff on a grapheme instead of character level. This
|
||||||
//! is particularly useful when working with text containing emojis. This
|
//! is particularly useful when working with text containing emojis. This
|
||||||
//! pulls in some relatively complex dependencies for working with the unicode
|
//! pulls in some relatively complex dependencies for working with the unicode
|
||||||
//! database.
|
//! database.
|
||||||
//! * `bytes`: this feature adds support for working with byte slices in the
|
//! * `bytes`: this feature adds support for working with byte slices in text
|
||||||
//! [`text`] module in addition to unicode strings. This pulls in the
|
//! APIs in addition to unicode strings. This pulls in the
|
||||||
//! [`bstr`] dependency.
|
//! [`bstr`] dependency.
|
||||||
//! * `inline`: this feature gives access to additional functionality of the
|
//! * `inline`: this feature gives access to additional functionality of the
|
||||||
//! [`text`] module to provide inline information about which values changed
|
//! text diffing to provide inline information about which values changed
|
||||||
//! in a line diff. This currently also enables the `unicode` feature.
|
//! in a line diff. This currently also enables the `unicode` feature.
|
||||||
#![warn(missing_docs)]
|
#![warn(missing_docs)]
|
||||||
pub mod algorithms;
|
pub mod algorithms;
|
||||||
pub mod text;
|
pub mod udiff;
|
||||||
|
|
||||||
mod common;
|
mod common;
|
||||||
|
#[cfg(feature = "text")]
|
||||||
|
mod text;
|
||||||
mod types;
|
mod types;
|
||||||
|
|
||||||
pub use self::common::*;
|
pub use self::common::*;
|
||||||
|
#[cfg(feature = "text")]
|
||||||
|
pub use self::text::*;
|
||||||
pub use self::types::*;
|
pub use self::types::*;
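The `src/lib.rs` change above makes `text` a private module and re-exports its public items from the crate root, which is why callers now write `similar::TextDiff` instead of `similar::text::TextDiff`. A minimal, std-only sketch of that re-export pattern follows. The `TextDiff` here is a hypothetical stand-in that only counts lines, and `line_counts` is an illustrative helper; neither is the crate's real API.

```rust
// Hypothetical miniature of the layout after this commit: a private module
// whose public items are surfaced at the top level through a glob re-export.
mod text {
    /// Stand-in type for illustration only; it just counts lines.
    pub struct TextDiff {
        pub old_lines: usize,
        pub new_lines: usize,
    }

    impl TextDiff {
        /// Builds the stand-in from two texts by counting their lines.
        pub fn from_lines(old: &str, new: &str) -> TextDiff {
            TextDiff {
                old_lines: old.lines().count(),
                new_lines: new.lines().count(),
            }
        }
    }
}

// Plays the role of `pub use self::text::*;` in the diff: everything public
// in `text` becomes reachable without naming the module.
pub use text::*;

/// Uses the re-exported name directly, the way downstream code would.
pub fn line_counts(old: &str, new: &str) -> (usize, usize) {
    let diff = TextDiff::from_lines(old, new);
    (diff.old_lines, diff.new_lines)
}
```

Because the module itself is no longer public, paths through `text::` stop resolving for downstream users, which matches the import changes in the example files of this commit.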
|
||||||
|
|
|
||||||
|
|
@ -1,5 +1,5 @@
|
||||||
---
|
---
|
||||||
source: src/text/udiff.rs
|
source: src/udiff.rs
|
||||||
expression: "&diff.unified_diff().header(\"a.txt\", \"b.txt\").to_string()"
|
expression: "&diff.unified_diff().header(\"a.txt\", \"b.txt\").to_string()"
|
||||||
---
|
---
|
||||||
--- a.txt
|
--- a.txt
|
||||||
|
|
@ -1,5 +1,5 @@
|
||||||
---
|
---
|
||||||
source: src/text/udiff.rs
|
source: src/udiff.rs
|
||||||
expression: "&diff.unified_diff().missing_newline_hint(false).header(\"a.txt\",\n \"b.txt\").to_string()"
|
expression: "&diff.unified_diff().missing_newline_hint(false).header(\"a.txt\",\n \"b.txt\").to_string()"
|
||||||
---
|
---
|
||||||
--- a.txt
|
--- a.txt
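The snapshots above exercise `missing_newline_hint`, which ties into the "Trailing Newlines" section of the crate docs: lines are diffed with their newline character, and a final line without one is flagged. The std-only sketch below illustrates that idea under simplifying assumptions. `lines_with_missing_newline` and `render` are hypothetical helpers, not functions of the `similar` crate, and the output only imitates the context lines of a unified diff.

```rust
/// Splits `text` into lines that keep their trailing newline and pairs each
/// line with a flag that is true when the newline is missing.
/// Hypothetical helper, not part of the `similar` crate.
pub fn lines_with_missing_newline(text: &str) -> Vec<(&str, bool)> {
    text.split_inclusive('\n')
        .map(|line| (line, !line.ends_with('\n')))
        .collect()
}

/// Renders every line as an unchanged context line (leading space) and adds
/// the conventional marker after a line that lacks its newline.
pub fn render(text: &str) -> String {
    let mut out = String::new();
    for (line, missing_newline) in lines_with_missing_newline(text) {
        out.push(' ');
        out.push_str(line);
        if missing_newline {
            // Emit a virtual newline first so the marker starts on its own line.
            out.push_str("\n\\ No newline at end of file\n");
        }
    }
    out
}
```

With this sketch, `foo\n` and `foo` produce different output, which is the distinction the documentation describes.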
|
||||||
|
|
@ -1,5 +1,5 @@
|
||||||
---
|
---
|
||||||
source: src/text/udiff.rs
|
source: src/udiff.rs
|
||||||
expression: "&diff.unified_diff().header(\"a.txt\", \"b.txt\").to_string()"
|
expression: "&diff.unified_diff().header(\"a.txt\", \"b.txt\").to_string()"
|
||||||
---
|
---
|
||||||
--- a.txt
|
--- a.txt
|
||||||
|
|
@ -1,90 +1,4 @@
|
||||||
//! Text diffing utilities.
|
//! Text diffing utilities.
|
||||||
//!
|
|
||||||
//! This provides helpful utilities for text (and more specifically line) diff
|
|
||||||
//! operations. The main type you want to work with is [`TextDiff`] which
|
|
||||||
//! uses the underlying diff algorithms to expose a convenient API to work with
|
|
||||||
//! texts.
|
|
||||||
//!
|
|
||||||
//! It can produce a unified diff and also let you iterate over the changeset
|
|
||||||
//! directly if you want.
|
|
||||||
//!
|
|
||||||
//! Text diffing is available by default but can be disabled by turning off the
|
|
||||||
//! default features. The feature to enable to get it back is `text`.
|
|
||||||
//!
|
|
||||||
//! # Examples
|
|
||||||
//!
|
|
||||||
//! A super simple example for how to generate a unified diff with three lines
|
|
||||||
//! off context around the changes:
|
|
||||||
//!
|
|
||||||
//! ```rust
|
|
||||||
//! # use similar::text::TextDiff;
|
|
||||||
//! # let old_text = "";
|
|
||||||
//! # let new_text = "";
|
|
||||||
//! let diff = TextDiff::from_lines(old_text, new_text);
|
|
||||||
//! let unified_diff = diff.unified_diff().header("old_file", "new_file").to_string();
|
|
||||||
//! ```
|
|
||||||
//!
|
|
||||||
//! This is another example that iterates over the actual changes:
|
|
||||||
//!
|
|
||||||
//! ```rust
|
|
||||||
//! # use similar::text::TextDiff;
|
|
||||||
//! # let old_text = "";
|
|
||||||
//! # let new_text = "";
|
|
||||||
//! let diff = TextDiff::from_lines(old_text, new_text);
|
|
||||||
//! for op in diff.ops() {
|
|
||||||
//! for change in diff.iter_changes(op) {
|
|
||||||
//! println!("{:?}", change);
|
|
||||||
//! }
|
|
||||||
//! }
|
|
||||||
//! ```
|
|
||||||
//!
|
|
||||||
//! # Ops vs Changes
|
|
||||||
//!
|
|
||||||
//! Because very commonly two compared sequences will largely match this module
|
|
||||||
//! splits it's functionality into two layers. The first is inherited from the
|
|
||||||
//! general [`algorithms`](crate::algorithms) module: changes are encoded as
|
|
||||||
//! [diff operations](crate::DiffOp). These are ranges of the
|
|
||||||
//! differences by index in the source sequence. Because this can be cumbersome
|
|
||||||
//! to work with a separate method [`TextDiff::iter_changes`] is provided which
|
|
||||||
//! expands all the changes on an item by item level encoded in an operation.
|
|
||||||
//!
|
|
||||||
//! Because the [`TextDiff::grouped_ops`] method can isolate clusters of changes
|
|
||||||
//! this even works for very long files if paired with this method.
|
|
||||||
//!
|
|
||||||
//! # Trailing Newlines
|
|
||||||
//!
|
|
||||||
//! When working with line diffs (and unified diffs in general) there are two
|
|
||||||
//! "philosophies" to look at lines. One is to diff lines without their newline
|
|
||||||
//! character, the other is to diff with the newline character. Typically the
|
|
||||||
//! latter is done because text files do not _have_ to end in a newline character.
|
|
||||||
//! As a result there is a difference between `foo\n` and `foo` as far as diffs
|
|
||||||
//! are concerned.
|
|
||||||
//!
|
|
||||||
//! In similar this is handled on the [`Change`] or [`InlineChange`] level. If
|
|
||||||
//! a diff was created via [`TextDiff::from_lines`] the text diffing system is
|
|
||||||
//! instructed to check if there are missing newlines encountered. If that is
|
|
||||||
//! the case the [`Change`] object will return true from the
|
|
||||||
//! [`Change::missing_newline`] method so the caller knows to handle this by
|
|
||||||
//! either rendering a virtual newline at that position or to indicate it in
|
|
||||||
//! different ways. For instance the unified diff code will render the special
|
|
||||||
//! `\ No newline at end of file` marker.
|
|
||||||
//!
|
|
||||||
//! # Bytes vs Unicode
|
|
||||||
//!
|
|
||||||
//! This module concerns itself with a loser definition of "text" than you would
|
|
||||||
//! normally see in Rust. While by default it can only operate on [`str`] types
|
|
||||||
//! by enabling the `bytes` feature it gains support for byte slices with some
|
|
||||||
//! caveats.
|
|
||||||
//!
|
|
||||||
//! A lot of text diff functionality assumes that what is being diffed constiutes
|
|
||||||
//! text, but in the real world it can often be challenging to ensure that this is
|
|
||||||
//! all valid utf-8. Because of this the crate is built so that most functinality
|
|
||||||
//! also still works with bytes for as long as they are roughtly ASCII compatible.
|
|
||||||
//!
|
|
||||||
//! This means you will be successful in creating a unified diff from latin1
|
|
||||||
//! encoded bytes but if you try to do the same with EBCDIC encoded bytes you
|
|
||||||
//! will only get garbage.
|
|
||||||
#![cfg(feature = "text")]
|
|
||||||
use std::borrow::Cow;
|
use std::borrow::Cow;
|
||||||
use std::cmp::Reverse;
|
use std::cmp::Reverse;
|
||||||
use std::collections::BinaryHeap;
|
use std::collections::BinaryHeap;
|
||||||
|
|
@ -92,15 +6,14 @@ use std::collections::BinaryHeap;
|
||||||
mod abstraction;
|
mod abstraction;
|
||||||
#[cfg(feature = "inline")]
|
#[cfg(feature = "inline")]
|
||||||
mod inline;
|
mod inline;
|
||||||
mod udiff;
|
|
||||||
mod utils;
|
mod utils;
|
||||||
|
|
||||||
pub use self::abstraction::{DiffableStr, DiffableStrRef};
|
pub use self::abstraction::{DiffableStr, DiffableStrRef};
|
||||||
#[cfg(feature = "inline")]
|
#[cfg(feature = "inline")]
|
||||||
pub use self::inline::InlineChange;
|
pub use self::inline::InlineChange;
|
||||||
pub use self::udiff::{unified_diff, UnifiedDiff, UnifiedDiffHunk, UnifiedHunkHeader};
|
|
||||||
|
|
||||||
use self::utils::{upper_seq_ratio, QuickSeqRatio};
|
use self::utils::{upper_seq_ratio, QuickSeqRatio};
|
||||||
|
use crate::udiff::UnifiedDiff;
|
||||||
use crate::{capture_diff_slices, get_diff_ratio, group_diff_ops, Algorithm, Change, DiffOp};
|
use crate::{capture_diff_slices, get_diff_ratio, group_diff_ops, Algorithm, Change, DiffOp};
|
||||||
|
|
||||||
/// A builder type config for more complex uses of [`TextDiff`].
|
/// A builder type config for more complex uses of [`TextDiff`].
|
||||||
|
|
@ -358,7 +271,7 @@ impl<'old, 'new, 'bufs, T: DiffableStr + ?Sized + 'old + 'new> TextDiff<'old, 'n
|
||||||
/// ratio of `0.0` would indicate completely distinct sequences.
|
/// ratio of `0.0` would indicate completely distinct sequences.
|
||||||
///
|
///
|
||||||
/// ```rust
|
/// ```rust
|
||||||
/// # use similar::text::TextDiff;
|
/// # use similar::TextDiff;
|
||||||
/// let diff = TextDiff::from_chars("abcd", "bcde");
|
/// let diff = TextDiff::from_chars("abcd", "bcde");
|
||||||
/// assert_eq!(diff.ratio(), 0.75);
|
/// assert_eq!(diff.ratio(), 0.75);
|
||||||
/// ```
|
/// ```
|
||||||
|
|
@ -411,7 +324,7 @@ impl<'old, 'new, 'bufs, T: DiffableStr + ?Sized + 'old + 'new> TextDiff<'old, 'n
|
||||||
/// to be considered similar. See [`TextDiff::ratio`] for more information.
|
/// to be considered similar. See [`TextDiff::ratio`] for more information.
|
||||||
///
|
///
|
||||||
/// ```
|
/// ```
|
||||||
/// # use similar::text::get_close_matches;
|
/// # use similar::get_close_matches;
|
||||||
/// let matches = get_close_matches(
|
/// let matches = get_close_matches(
|
||||||
/// "appel",
|
/// "appel",
|
||||||
/// &["ape", "apple", "peach", "puppy"][..],
|
/// &["ape", "apple", "peach", "puppy"][..],
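The "Ops vs Changes" section moved into the crate docs by this commit describes two layers: operations that store index ranges, and per-item changes expanded from them. The sketch below shows that split with simplified stand-in types. `Op`, `Tag` and `expand` are assumptions made for illustration and do not mirror the exact shape of the crate's `DiffOp`, `ChangeTag` or `iter_changes`.

```rust
use std::ops::Range;

/// Simplified stand-in for a diff operation: it records index ranges into
/// the old and new sequences instead of the items themselves.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Equal { old: Range<usize>, new: Range<usize> },
    Delete { old: Range<usize> },
    Insert { new: Range<usize> },
}

/// Tag attached to each expanded item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tag {
    Equal,
    Delete,
    Insert,
}

/// Expands one operation into item-by-item changes.
pub fn expand<'a, T>(op: &Op, old: &'a [T], new: &'a [T]) -> Vec<(Tag, &'a T)> {
    match op {
        // Equal items are read from the old side; the new side holds the same values.
        Op::Equal { old: o, .. } => old[o.clone()].iter().map(|x| (Tag::Equal, x)).collect(),
        Op::Delete { old: o } => old[o.clone()].iter().map(|x| (Tag::Delete, x)).collect(),
        Op::Insert { new: n } => new[n.clone()].iter().map(|x| (Tag::Insert, x)).collect(),
    }
}
```

Keeping ranges in the operations means a long run of equal items costs one small value, and items are only materialized for the operations a caller chooses to expand.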
|
||||||
|
|
|
||||||
|
|
@ -1,10 +1,10 @@
|
||||||
//! This module provides unified diff functionality.
|
//! This module provides unified diff functionality.
|
||||||
//!
|
//!
|
||||||
//! This module is available for as long as the `text` feature is enabled which
|
//! It is available for as long as the `text` feature is enabled which
|
||||||
//! is enabled by default.
|
//! is enabled by default:
|
||||||
//!
|
//!
|
||||||
//! ```rust
|
//! ```rust
|
||||||
//! use similar::text::TextDiff;
|
//! use similar::TextDiff;
|
||||||
//! # let old_text = "";
|
//! # let old_text = "";
|
||||||
//! # let new_text = "";
|
//! # let new_text = "";
|
||||||
//! let text_diff = TextDiff::from_lines(old_text, new_text);
|
//! let text_diff = TextDiff::from_lines(old_text, new_text);
|
||||||
|
|
@ -21,15 +21,13 @@
|
||||||
//! versions by using [`UnifiedDiff.to_string`] or [`UnifiedDiff.to_writer`].
|
//! versions by using [`UnifiedDiff.to_string`] or [`UnifiedDiff.to_writer`].
|
||||||
//! The former uses [`DiffableStr::to_string_lossy`], the latter uses
|
//! The former uses [`DiffableStr::to_string_lossy`], the latter uses
|
||||||
//! [`DiffableStr::as_bytes`] for each line.
|
//! [`DiffableStr::as_bytes`] for each line.
|
||||||
|
#[cfg(feature = "text")]
|
||||||
use std::ops::Range;
|
use std::ops::Range;
|
||||||
use std::{fmt, io};
|
use std::{fmt, io};
|
||||||
|
|
||||||
use crate::text::TextDiff;
|
use crate::text::{DiffableStr, TextDiff};
|
||||||
use crate::types::{Algorithm, Change, DiffOp};
|
use crate::types::{Algorithm, Change, DiffOp};
|
||||||
|
|
||||||
use super::DiffableStr;
|
|
||||||
|
|
||||||
struct MissingNewlineHint(bool);
|
struct MissingNewlineHint(bool);
|
||||||
|
|
||||||
impl fmt::Display for MissingNewlineHint {
|
impl fmt::Display for MissingNewlineHint {
|
||||||
|
|
@ -99,7 +97,7 @@ impl fmt::Display for UnifiedHunkHeader {
|
||||||
/// Unified diff formatter.
|
/// Unified diff formatter.
|
||||||
///
|
///
|
||||||
/// ```rust
|
/// ```rust
|
||||||
/// use similar::text::TextDiff;
|
/// use similar::TextDiff;
|
||||||
/// # let old_text = "";
|
/// # let old_text = "";
|
||||||
/// # let new_text = "";
|
/// # let new_text = "";
|
||||||
/// let text_diff = TextDiff::from_lines(old_text, new_text);
|
/// let text_diff = TextDiff::from_lines(old_text, new_text);
|
||||||